Abstract. This paper develops tests for inequality constraints of nonparametric
regression functions. The test statistics involve a one-sided version of $L_p$-type functionals
of kernel estimators ($1 \le p < \infty$). Drawing on the approach of Poissonization, this
paper establishes that the tests are asymptotically distribution free, admitting asymptotic
normal approximation. In particular, the tests using the standard normal critical
values have asymptotically correct size and are consistent against general fixed alternatives.
Furthermore, we establish conditions under which the tests have nontrivial local
power against Pitman local alternatives. Some results from Monte Carlo simulations
are presented.
1. Introduction

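Suppose that we observe $\{(Y_i', X_i')'\}_{i=1}^n$ that are i.i.d. copies from a random vector
$(Y', X')' \in \mathbb{R}^J \times \mathbb{R}^d$. Write $Y_i = (Y_{1i}, \ldots, Y_{Ji})' \in \mathbb{R}^J$
and define $m_j(x) \equiv E[Y_{ji} \mid X_i = x]$, $j = 1, 2, \ldots, J$. The notation $\equiv$ indicates definition.

This paper focuses on the problem of testing functional inequalities:

$$
\begin{aligned}
H_0 &: m_j(x) \le 0 \text{ for all } (x, j) \in \mathcal{X} \times \mathcal{J}, \text{ vs.} \\
H_1 &: m_j(x) > 0 \text{ for some } (x, j) \in \mathcal{X} \times \mathcal{J},
\end{aligned}
\tag{1.1}
$$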

where $\mathcal{X} \subset \mathbb{R}^d$ is the domain of interest and $\mathcal{J} = \{1, \ldots, J\}$. Our testing problem
is relevant in various applied settings. For example, in a randomized controlled trial,
a researcher observes either an outcome with treatment ($W_1$) or an outcome without
treatment ($W_0$) along with observable pre-determined characteristics of the subjects
($X$). Let $D = 1$ if the subject belongs to the treatment group and $D = 0$ otherwise. Suppose
that assignment to treatment is random and independent of $X$ and that the assignment
probability $p \equiv P\{D = 1\}$, $0 < p < 1$, is fixed by the experiment design. Then the
average treatment effect $E(W_1 - W_0 \mid X = x)$, conditional on $X = x$, can be written as

$$
E(W_1 - W_0 \mid X = x) = E\left[ \frac{D W}{p} - \frac{(1 - D) W}{1 - p} \,\Big|\, X = x \right],
$$

where $W \equiv D W_1 + (1 - D) W_0$. In this setup, it may be of interest to test whether or
not $m(x) \equiv E(W_1 - W_0 \mid X = x) \le 0$ for all $x$.
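
To make the randomized-trial example concrete, the following is a minimal Python sketch of the transformed outcome $DW/p - (1-D)W/(1-p)$, whose conditional mean equals the conditional average treatment effect when assignment is random with known probability $p$. The simulated design (uniform covariate, effect $x - 0.5$) and the function names are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def transformed_outcome(w, d, p):
    """Return Y_i = D_i W_i / p - (1 - D_i) W_i / (1 - p)."""
    w = np.asarray(w, dtype=float)
    d = np.asarray(d, dtype=float)
    return d * w / p - (1.0 - d) * w / (1.0 - p)

def simulate_rct(n, p=0.5, seed=0):
    """Hypothetical design: the true conditional effect is tau(x) = x - 0.5."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(0.0, 1.0, n)
    d = (rng.uniform(size=n) < p).astype(float)  # random assignment, independent of x
    w0 = rng.standard_normal(n)                  # outcome without treatment
    w1 = w0 + (x - 0.5)                          # outcome with treatment
    w = d * w1 + (1.0 - d) * w0                  # observed outcome
    return x, d, w

x, d, w = simulate_rct(200_000, p=0.3)
y = transformed_outcome(w, d, p=0.3)
upper = x > 0.5
# The average of y over x > 0.5 should be close to E[x - 0.5 | x > 0.5] = 0.25.
mean_upper = float(y[upper].mean())
```

With $Y$ constructed this way, testing $m(x) \le 0$ for all $x$ fits the single-inequality case ($J = 1$) of the framework.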

In economic theory, primitive assumptions of economic models generate certain testable 
implications in the form of functional inequalities. For example, Chiappori, Jullien, 
Salanie, and Salanie (2006) formulated some testable restrictions in the study of insur- 
ance markets. Our tests are applicable for testing their restrictions (e.g. equation (4) of 
Chiappori, Jullien, Salanie, and Salanie (2006)). Furthermore, our method can be used 
to test for monotone treatment response (see, e.g. Manski (1997)). For example, testing 
for a decreasing demand curve for each level of price in treatments and for each value of 
covariates falls within the framework of this paper. 

Our test statistic can also be used to construct confidence regions for a parameter 
that is partially identified under conditional moment inequalities. See, among many 
others, Andrews and Shi (2011a,b), Armstrong (2011), Chernozhukov, Lee, and Rosen 
(2009), Chetverikov (2012), and references therein for inference with conditional moment 
inequalities. 

This paper proposes a one-sided $L_p$ approach in testing nonparametric functional
inequalities. While measuring the quality of an estimated nonparametric function by
its $L_p$-distance from the true function has long received attention in the literature (see
Devroye and Gyorfi (1985), for an elegant treatment of the $L_1$ norm of nonparametric
density estimation), the advance of this approach for general nonparametric testing
seems to have been rather slow relative to other approaches, perhaps due to its technical
complexity.

Csorgo and Horvath (1988) first established a central limit theorem for the $L_p$-distance
of a kernel density estimator from its population counterpart, and Horvath (1991)
introduced a Poissonization technique into the analysis of the $L_p$-distance. Beirlant and
Mason (1995) developed a different Poissonization technique and established a central
limit theorem for the $L_p$-distance of kernel density estimators and regressograms from
their expected values without assuming smoothness conditions for the nonparametric
functions. Gine, Mason and Zaitsev (2003: GMZ, hereafter) employed this technique
to prove the weak convergence of an $L_1$-distance process indexed by kernel functions in
kernel density estimators.
This paper builds on the contributions of Beirlant and Mason (1995) and GMZ to
develop methods for testing (1.1). In particular, the tests that we propose are studentized
versions of one-sided $L_p$-type functionals. We show that our proposed test statistic is
distributed as standard normal under the least favorable case of the null hypothesis.
Thus, our tests using the standard normal critical values have asymptotically correct size.
We also show that our tests are consistent against general fixed alternatives and carry out
local power analysis with Pitman alternatives. For the latter, we establish conditions
under which the tests have nontrivial local power against Pitman local alternatives,
including some $n^{-1/2}$-converging Pitman sequences.

Our tests have the following desirable properties. First, our tests do not require the usual
smoothness conditions for nonparametric functions for their asymptotic validity and
consistency. This is because we do not need pointwise or uniform consistency of an unknown
function to implement our tests. For example, a studentized version of our statistic can
be estimated without need for controlling the bias. Second, our tests for (1.1) are
distribution free under the least favorable case of the null hypothesis, where $m_j(x) = 0$
for all $x \in \mathcal{X}$ and for all $j \in \mathcal{J}$, and at the same time have nontrivial power against
some, though not all, $n^{-1/2}$-converging Pitman local alternatives. This is somewhat
unexpected, given that nonparametric goodness-of-fit tests that involve random vectors
of a multi-dimension and have nontrivial power against $n^{-1/2}$-converging Pitman
sequences are not often distribution free. Exceptions are tests that use an innovation
martingale approach (see, e.g., Khmaladze (1993), Stute, Thies and Zhu (1998), Bai
(2003), and Khmaladze and Koul (2004)) or some tests of independence (or conditional
independence) among random variables (see, e.g., Blum, Kiefer, and Rosenblatt (1961),
Delgado and Mora (2000) and Song (2009)). Third, the local power calculation of our
tests for (1.1) reveals an interesting contrast with other nonparametric tests based on
kernel smoothers, e.g. Hardle and Mammen (1993) and Horowitz and Spokoiny (2001),
where the latter tests are known to have trivial power against $n^{-1/2}$-converging Pitman
local alternatives. Our inequality tests can have nontrivial local power against
$n^{-1/2}$-converging Pitman local alternatives, provided that a certain integral associated with
local alternatives is strictly positive. On the other hand, it is shown in Section 4 that
our equality tests have trivial power against $n^{-1/2}$-converging Pitman local alternatives.
Therefore, the one-sided nature of inequality testing is the source of our different local
power results. This finding appears new in the literature to the best of our knowledge.

The remainder of the paper is organized as follows. Section 2 discusses the related literature.
Section 3 provides an informal description of our test statistic for a simple case, and
establishes conditions under which our tests have asymptotically valid size when the
null hypothesis is true and also are consistent against fixed alternatives. We also obtain
local power results for the leading cases when $p = 1$ and $p = 2$. In Section 4, we make
a comparison with functional equality tests and highlight the main differences between
testing inequalities and equalities in terms of local power. In Section 5, we report results
of some Monte Carlo simulations that show that our tests perform well in finite samples.
The proofs of main theorems are contained in Section 6, along with a roadmap for the
proof of the main theorem.



2. Related Literature 

In this section, we provide details on the related literature. The literature on hypoth- 
esis testing involving nonparametric functions has a long history. Many studies have 
focused on testing parametric or semiparametric specifications of regression functions 
against nonparametric alternatives. See, e.g., Bickel and Rosenblatt (1973), Hardle and 
Mammen (1993), Stute (1997), Delgado and Gonzalez Manteiga (2000), Horowitz and 
Spokoiny (2001), and Khmaladze and Koul (2004) among many others. The testing 
problem in this paper is different from the aforementioned papers, as the focus is on 
whether certain inequality (or equality) restrictions hold, rather than on whether cer- 
tain parametric specifications are plausible. 

When $J = 1$, our testing problem is also different from testing

$$
\begin{aligned}
H_0 &: m(x) = 0 \text{ for all } x \in \mathcal{X}, \text{ against} \\
H_1 &: m(x) \ge 0 \text{ for all } x \in \mathcal{X} \text{ with strict inequality for some } x \in \mathcal{X}.
\end{aligned}
$$

Related to this type of testing problem, see Hall, Huber, and Speckman (1997) and Koul
and Schick (1997, 2003) among others. In their setup, the possibility that $m(x) < 0$ for
some $x$ is excluded, so that a consistent test can be constructed using a linear functional
of $m(x)$. On the other hand, in our setup, negative values of $m(x)$ for some $x$ are allowed
under both $H_0$ and $H_1$. As a result, a linear functional of $m(x)$ would not be suitable
for our purpose.

There also exist some papers that consider the testing problem in (1.1). For example,
Hall and Yatchew (2005) and Andrews and Shi (2011a,b) considered functions of the form
$u \mapsto \max\{u, 0\}^p$ to develop tests for (1.1). However, their tests are not distribution free,
although they achieve local power against some $n^{-1/2}$-converging sequences. See also
Hall and van Keilegom (2005) for the use of the one-sided $L_p$-type functionals for testing
for monotone increasing hazard rate. None of the aforementioned papers developed test
statistics of one-sided $L_p$-type functionals with kernel estimators like ours. See some
remarks of Ghosal, Sen, and van der Vaart (2000, p. 1070) on the difficulty in dealing with
one-sided $L_p$-type functionals with kernel estimators.
In view of Bickel and Rosenblatt (1973) who considered both $L_2$ and sup tests, a
one-sided sup test appears to be a natural alternative to the $L_p$-type tests studied in
this paper. For example, Chernozhukov, Lee, and Rosen (2009) considered a sup norm
approach in testing inequality constraints of nonparametric functions. Also, it may be
of interest to develop sup tests based on a one-sided version of a bootstrap uniform
confidence interval of $\hat{g}_n$, similar to Claeskens and van Keilegom (2003). The sup tests
typically do not have nontrivial power against any $n^{-1/2}$-converging alternatives, but
they may have better power against some "sharp peak" type alternatives (Liero, Lauter
and Konakov, 1998).

Testing for inequality is related to testing for monotonicity since a null hypothesis
associated with inequality (respectively, monotonicity) can also be framed as that of
monotonicity (respectively, convexity) of integrated moments. For example, Durot (2003)
and Delgado and Escanciano (2011, 2012) used the least concave majorant operator to
characterize their null hypotheses and developed tests based on isotonic regression
methods.

Finally, we mention that there exist other applications of the Poissonization method.
For example, Anderson, Linton, and Whang (2012) developed methodology for kernel
estimation of a polarization measure; Lee and Whang (2009) established asymptotic null
distributions for the $L_1$-type test statistics for conditional treatment effects; and Mason
(2009) established both finite sample and asymptotic moment bounds for the $L_p$ risk for
kernel density estimators. See also Mason and Polonik (2009) and Biau, Cadre, Mason,
and Pelletier (2009) for asymptotic distribution theory in support estimation.

Among all the aforementioned papers, our work is most closely related to Lee and
Whang (2009), but differs substantially in several important ways. First, we consider
the case of multiple functional inequalities, in contrast to the single inequality case of
Lee and Whang (2009). This extension requires different arguments (see, e.g., Lemma
A7 in Section 6.2) and is necessary in order to encompass important applications such as
testing monotonic treatment response and inference with conditional moment inequalities.
Second, we extend the $L_1$ statistic to the general $L_p$ statistic. Such an extension
is not only theoretically challenging, because many of the results of GMZ apply only to
the $L_1$ statistic (see, e.g., Lemmas A3 and A8 in Section 6.2), but also useful
to applied econometricians because the $L_p$-type test statistics with different values of
$p$ generally have different power properties. Third, regularity conditions are weaker in
this paper than those in Lee and Whang (2009). In particular, we allow the underlying
functions to be non-smooth, which should be useful in some contexts. We believe that
none of these extensions are trivial. Therefore, we view these two papers as complements
rather than substitutes.
The testing framework in this paper could be easily extended to testing stochastic
dominance conditional on covariates in the one-sample case or in the program evaluation
setup described in the introduction. For the latter setup, testing conditional stochastic
dominance amounts to testing $m(x, y) \equiv E[1(W_1 \le y) - 1(W_0 \le y) \mid X = x] \le 0$ for all
$(x, y) \in \mathcal{XY}$, where $\mathcal{XY}$ is the domain of interest and $W_1$ and $W_0$, as before, are
outcomes for treatment and control groups, respectively. Then a conditional stochastic
dominance test can be developed by combining a density weighted kernel estimator
of $m(x, y)$ with a one-sided $L_p$-type functional. However, it is not straightforward to
extend our framework to general two-sample cases. This is because the propensity score
$P(D = 1 \mid X = x)$ is unknown in general and has to be estimated to implement the test.
See, for example, Lee and Whang (2009), Delgado and Escanciano (2011), and Hsu
(2011) for testing conditional treatment effects, including testing conditional stochastic
dominance, in general two-sample cases.

3. Test Statistics and Asymptotic Properties 

3.1. An Informal Description of Our Test Statistics. Our tests are based on one-sided
$L_p$-type functionals. For $1 \le p < \infty$, let $\Lambda_p : \mathbb{R} \to \mathbb{R}$ be such that
$\Lambda_p(v) \equiv \max\{v, 0\}^p$, $v \in \mathbb{R}$. Consider the following one-sided $L_p$-type functionals:

$$
\varphi \mapsto \Gamma_j(\varphi) \equiv \int_{\mathcal{X}} \Lambda_p(\varphi(x)) w_j(x) \, dx, \quad \text{for } j \in \mathcal{J},
$$

where $w_j : \mathbb{R}^d \to [0, \infty)$ is a nonnegative weight function. Let $f$ denote the density
function of $X$ and define $g_j(x) \equiv m_j(x) f(x)$. To construct a test statistic, define

$$
\hat{g}_{jn}(x) \equiv \frac{1}{n h^d} \sum_{i=1}^n Y_{ji} K\left( \frac{x - X_i}{h} \right),
$$

where $K : \mathbb{R}^d \to \mathbb{R}$ is a kernel function and $h$ is a bandwidth parameter satisfying $h \to 0$
as $n \to \infty$. Our test statistic is a suitably studentized version of the $\Gamma_j(\hat{g}_{jn})$'s.

Note that we focus on values of $x$ for which $\hat{g}_{jn}(x) > 0$ through the use of $\Lambda_p(v)$.
Thus, we expect that when $H_0$ is true, a suitably studentized version of $\Gamma_j(\hat{g}_{jn})$ is "not
too large" for each $j \in \mathcal{J}$ but that when $H_0$ is false, it will diverge for some $j \in \mathcal{J}$. This
motivates the use of a weighted sum of $\Gamma_j(\hat{g}_{jn})$ as a test statistic. We require that at least
one component of $X$ be continuously distributed. If some elements of $X$ are discrete, we
can modify the integral in the functional above by using some product measure between
the Lebesgue and counting measures.
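
The two ingredients described above, the kernel estimator of $g_j = m_j f$ and the one-sided $L_p$-type functional, can be sketched numerically as follows. This is a minimal illustration for $d = 1$ under stated assumptions: the kernel $1.5(1 - (2u)^2)$ on $[-1/2, 1/2]$ is one convenient choice with bounded support, the integral is approximated by a Riemann sum on a uniform grid, and the function names are hypothetical rather than taken from the paper.

```python
import numpy as np

def kernel(u):
    """Assumed kernel: 1.5 * (1 - (2u)^2) on [-1/2, 1/2], zero elsewhere."""
    u = np.asarray(u, dtype=float)
    return np.where(np.abs(u) <= 0.5, 1.5 * (1.0 - (2.0 * u) ** 2), 0.0)

def g_hat(grid, x, y, h):
    """Kernel estimate (1 / (n h)) * sum_i Y_i K((x0 - X_i) / h) at each grid point (d = 1)."""
    u = (grid[:, None] - x[None, :]) / h
    return (kernel(u) * y[None, :]).sum(axis=1) / (len(x) * h)

def gamma_functional(values, weights, grid, p):
    """Riemann-sum approximation of int max{phi(x), 0}^p w(x) dx on a uniform grid."""
    dx = grid[1] - grid[0]
    return float(np.sum(np.maximum(values, 0.0) ** p * weights) * dx)
```

For example, `gamma_functional(g_hat(grid, x, y, h), w, grid, p)` evaluates the functional at the kernel estimate; only the region where the estimate is positive contributes, which is what makes the functional one-sided.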
We show in Section 3.2 that under weak assumptions, there exist nonstochastic
sequences $a_{jn} \in \mathbb{R}$, $j \in \mathcal{J}$, and $\sigma_n \in (0, \infty)$ such that as $n \to \infty$,

$$
T_n \equiv \frac{1}{\sigma_n} \sum_{j=1}^J \left\{ n^{p/2} h^{(p-1)d/2} \Gamma_j(\hat{g}_{jn}) - a_{jn} \right\} \to_d N(0, 1),
$$

under the least favorable case of the null hypothesis, where $m_j(x) = 0$ for all $x \in \mathcal{X}$
and for all $j \in \mathcal{J}$. This is done first by deriving asymptotic results for the Poissonized
version of the processes, $\{\hat{g}_{jn}(x) : x \in \mathcal{X}\}$, $j \in \mathcal{J}$, and then by translating them back
into those for the original processes through the de-Poissonization lemma of Beirlant
and Mason (1995). See Appendix 6.1 for details.

To construct a test statistic, we replace $a_{jn}$ and $\sigma_n$ by appropriate estimators to
obtain a feasible version of $T_n$, say, $\hat{T}_n$, and show that the limiting distribution remains
the same under a stronger bandwidth condition. Hence, we obtain a distribution free
and consistent test for the nonparametric functional inequality constraints.

To provide a preview of local power analysis with Pitman alternatives in Section 3.3,
suppose that $J = 1$ and $p = 1$, and the form of the local alternatives is $g_1(x) = \varrho_n \delta_1(x)$
for some function $\delta_1(x)$, where $\varrho_n$ is a sequence of real numbers that converges to 0
as $n \to \infty$. Then (1) if $\int_{\mathcal{X}} \delta_1(x) w_1(x) dx > 0$, our test has nontrivial power against
sequences of local alternatives with $\varrho_n \propto n^{-1/2}$; (2) if $\int_{\mathcal{X}} \delta_1(x) w_1(x) dx = 0$, our test has
nontrivial power only against sequences of local alternatives for which $\varrho_n \to 0$ at a rate
slower than $n^{-1/2}$; and (3) if $\int_{\mathcal{X}} \delta_1(x) w_1(x) dx < 0$, our test is locally biased whether or
not $\varrho_n \propto n^{-1/2}$, although our test is a consistent test against general fixed alternatives.

An alternative statistic is a max statistic such as $\max_{j \in \mathcal{J}} \{ n^{p/2} h^{(p-1)d/2} \Gamma_j(\hat{g}_{jn}) - a_{jn} \}$,
which we do not pursue in this paper since the "max" version of the test is not typically
asymptotically pivotal.

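3.2. Test Statistics and Asymptotic Validity. Define $\mathcal{S}_j \equiv \{x \in \mathcal{X} : w_j(x) > 0\}$
for each $j \in \mathcal{J}$, and, given $\varepsilon > 0$, let $\mathcal{S}_j^\varepsilon$ be an $\varepsilon$-enlargement of $\mathcal{S}_j$, i.e.,
$\mathcal{S}_j^\varepsilon \equiv \{x + a : x \in \mathcal{S}_j, a \in [-\varepsilon, \varepsilon]^d\}$. For $1 \le p < \infty$, let

$$
r_{j,p}(x) \equiv E[|Y_{ji}|^p \mid X_i = x] f(x).
$$

We introduce the following assumptions.

Assumption 1: (i) For each $j \in \mathcal{J}$ and for some $\varepsilon > 0$, $r_{j,2}(x)$ is bounded away from
zero and $r_{j,2p+2}(x)$ is bounded, both uniformly in $x \in \mathcal{S}_j^\varepsilon$.

(ii) For each $j \in \mathcal{J}$, $w_j(\cdot)$ is nonnegative on $\mathcal{X}$ and $0 < \int_{\mathcal{X}} w_j^s(x) dx < \infty$, where
$s \in \{1, 2\}$.

(iii) For $\varepsilon > 0$ in (i), $\mathcal{S}_j^\varepsilon \subset \mathcal{X}$ for all $j \in \mathcal{J}$.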
Assumption 2: $K(u) = \prod_{s=1}^d K_s(u_s)$, $u = (u_1, \ldots, u_d)$, with each $K_s : \mathbb{R} \to \mathbb{R}$,
$s = 1, \ldots, d$, satisfying that (a) $K_s(u_s) = 0$ for all $u_s \in \mathbb{R} \setminus [-1/2, 1/2]$, (b) $K_s$ is of
bounded variation, and (c) $\|K_s\|_\infty \equiv \sup_{u_s \in \mathbb{R}} |K_s(u_s)| < \infty$ and $\int K_s(u_s) du_s = 1$.



Assumption 1(i) imposes that $\inf\{r_{j,2}(x) : x \in \mathcal{S}_j^\varepsilon\} > 0$ and
$\sup\{r_{j,2p+2}(x) : x \in \mathcal{S}_j^\varepsilon\} < \infty$ for each $j \in \mathcal{J}$. Assumption 1(ii) is a weak condition on the weight function.
Nonnegativity is important since we develop a sum statistic over $j$. Assumption 1(iii) is
introduced to avoid the boundary problem of kernel estimators by requiring that $w_j$ have
support inside an $\varepsilon$-shrunk subset of $\mathcal{X}$. Note that Assumptions 1(i) and (iii) require
that $\mathcal{S}_j$ be a bounded set for each $j \in \mathcal{J}$. The conditions for the kernel function in
Assumption 2 are quite flexible, except that the kernel functions have bounded support.
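
As an illustration of Assumption 2, the sketch below builds a product kernel from a univariate kernel supported on $[-1/2, 1/2]$ and checks numerically that it integrates to one. The particular univariate kernel $1.5(1 - 4u^2)$ is an assumption made here for concreteness; any bounded kernel of bounded variation with this support and unit integral would serve.

```python
import numpy as np

def k1(u):
    """Assumed univariate kernel: 1.5 * (1 - 4 u^2) on [-1/2, 1/2], zero elsewhere."""
    u = np.asarray(u, dtype=float)
    return np.where(np.abs(u) <= 0.5, 1.5 * (1.0 - 4.0 * u ** 2), 0.0)

def product_kernel(u):
    """K(u) = prod_s K_s(u_s) for an array u of shape (..., d)."""
    return np.prod(k1(np.asarray(u, dtype=float)), axis=-1)

# Numerical check of part (c): the univariate kernel integrates to one.
grid = np.linspace(-0.5, 0.5, 20001)
step = grid[1] - grid[0]
integral_1d = float(np.sum(k1(grid)) * step)
```

This kernel is also nonnegative, which is the additional requirement imposed on $K$ in Theorem 2.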

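Define for $j, k \in \mathcal{J}$ and $x \in \mathbb{R}^d$,

$$
\begin{aligned}
\rho_{jk,n}(x) &\equiv \frac{1}{h^d} E\left[ Y_{ji} Y_{ki} K^2\left( \frac{x - X_i}{h} \right) \right], \\
\rho_{jn}^2(x) &\equiv \frac{1}{h^d} E\left[ Y_{ji}^2 K^2\left( \frac{x - X_i}{h} \right) \right], \\
\rho_{jk}(x) &\equiv E[Y_{ji} Y_{ki} \mid X_i = x] f(x) \int K^2(u) du, \text{ and} \\
\rho_j^2(x) &\equiv E[Y_{ji}^2 \mid X_i = x] f(x) \int K^2(u) du.
\end{aligned}
$$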



Let $Z_1$ and $Z_2$ denote mutually independent standard normal random variables. We
introduce the following quantities:

$$
\begin{aligned}
a_{jn} &\equiv h^{-d/2} \int_{\mathcal{X}} \rho_{jn}^p(x) w_j(x) dx \cdot E\Lambda_p(Z_1) \text{ and} \\
\sigma_{jk,n} &\equiv \int_{\mathcal{X}} q_{jk,p}(x) \rho_{jn}^p(x) \rho_{kn}^p(x) w_j(x) w_k(x) dx,
\end{aligned}
\tag{3.3}
$$

where $q_{jk,p}(x) \equiv \int_{[-1,1]^d} \mathrm{Cov}\left( \Lambda_p\left( \sqrt{1 - t_{jk}^2(x, u)} Z_1 + t_{jk}(x, u) Z_2 \right), \Lambda_p(Z_2) \right) du$ and

$$
t_{jk}(x, u) \equiv \frac{\rho_{jk}(x)}{\rho_j(x) \rho_k(x)} \cdot \frac{\int K(x) K(x + u) dx}{\int K^2(x) dx}.
$$

Let $\Sigma_n$ be a $J \times J$ matrix whose $(j, k)$-th entry is given by $\sigma_{jk,n}$. Later we use $\Sigma_n$ to
normalize the test statistic. The scale normalization matrix $\Sigma_n$ does not depend on
$x$, and this is not because we are assuming conditional homoskedasticity in the null
hypothesis, but because $\Sigma_n$ is constituted by covariances of random quantities that
already have $x$ integrated out. We also define $\Sigma$ to be a $J \times J$ matrix whose $(j, k)$-th
entry is given by $\sigma_{jk}$, where

$$
\sigma_{jk} \equiv \int_{\mathcal{X}} q_{jk,p}(x) \rho_j^p(x) \rho_k^p(x) w_j(x) w_k(x) dx.
$$



As for $\Sigma$, we introduce the following assumption.

Assumption 3: $\Sigma$ is positive definite.

For example, Assumption 3 excludes the case where $Y_{ji}$ and $Y_{ki}$ ($j \ne k$) are perfectly
correlated conditional on $X_i = x$ for almost all $x$ with $w_j = w_k$.
The following theorem is the first main result of this paper. 

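Theorem 1: Suppose that Assumptions 1-3 hold and that $h \to 0$ and $n^{-1/2} h^{-d} \to 0$ as
$n \to \infty$. Furthermore, assume that $m_j(x) = 0$ for almost all $x \in \mathcal{X}$ and for all $j \in \mathcal{J}$.
Then

$$
T_n \equiv \frac{1}{\sigma_n} \sum_{j=1}^J \left\{ n^{p/2} h^{(p-1)d/2} \Gamma_j(\hat{g}_{jn}) - a_{jn} \right\} \to_d N(0, 1),
$$

where $\sigma_n^2 \equiv \mathbf{1}' \Sigma_n \mathbf{1}$, and $\mathbf{1}$ is a vector of ones.

Note that when $J = 1$, $\sigma_n^2$ takes the simple form of $q_p \int_{\mathcal{X}} \rho_{1n}^{2p}(x) w_1^2(x) dx$, where

$$
q_p \equiv \int_{[-1,1]^d} \mathrm{Cov}\left( \Lambda_p\left( \sqrt{1 - t^2(u)} Z_1 + t(u) Z_2 \right), \Lambda_p(Z_2) \right) du, \text{ and }
t(u) \equiv \frac{\int K(x) K(x + u) dx}{\int K^2(x) dx}.
$$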



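To develop a feasible testing procedure, we construct estimators of $a_{jn}$'s and $\sigma_n^2$ as
follows. First, define the sample analogues of $\rho_{jk,n}$ and $\rho_{jn}^2$:

$$
\hat{\rho}_{jk,n}(x) \equiv \frac{1}{n h^d} \sum_{i=1}^n Y_{ji} Y_{ki} K^2\left( \frac{x - X_i}{h} \right) \text{ and }
\hat{\rho}_{jn}^2(x) \equiv \frac{1}{n h^d} \sum_{i=1}^n Y_{ji}^2 K^2\left( \frac{x - X_i}{h} \right).
\tag{3.4}
$$

We estimate $a_{jn}$ and $\sigma_{jk,n}$ by

$$
\begin{aligned}
\hat{a}_{jn} &\equiv h^{-d/2} \int_{\mathcal{X}} \hat{\rho}_{jn}^p(x) w_j(x) dx \cdot E\Lambda_p(Z_1) \text{ and} \\
\hat{\sigma}_{jk,n} &\equiv \int_{\mathcal{X}} \hat{q}_{jk,p}(x) \hat{\rho}_{jn}^p(x) \hat{\rho}_{kn}^p(x) w_j(x) w_k(x) dx,
\end{aligned}
$$

where $\hat{q}_{jk,p}(x) \equiv \int_{[-1,1]^d} \mathrm{Cov}\left( \Lambda_p\left( \sqrt{1 - \hat{t}_{jk}^2(x, u)} Z_1 + \hat{t}_{jk}(x, u) Z_2 \right), \Lambda_p(Z_2) \right) du$ and

$$
\hat{t}_{jk}(x, u) \equiv \frac{\hat{\rho}_{jk,n}(x)}{\hat{\rho}_{jn}(x) \hat{\rho}_{kn}(x)} \cdot \frac{\int K(x) K(x + u) dx}{\int K^2(x) dx}.
$$

Note that $E\Lambda_1(Z_1) = 1/\sqrt{2\pi} \approx 0.39894$ and $E\Lambda_2(Z_1) = 1/2$. When $p$ is an integer,
the covariance expression in $\hat{q}_{jk,p}(x)$ can be computed using the moment generating
function of a truncated multivariate normal distribution (Tallis, 1961). More practically,
simulated draws from $Z_1$ and $Z_2$ can be used to compute the quantities $E\Lambda_p(Z_1)$ and
$\hat{q}_{jk,p}(x)$ for general values of $p$. The integrals appearing above can be evaluated using
methods of numerical integration. We define $\hat{\Sigma}_n$ to be a $J \times J$ matrix whose $(j, k)$-th
entry is given by $\hat{\sigma}_{jk,n}$.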
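
The computations mentioned above can be sketched as follows for the case $J = 1$ and $d = 1$, where $\hat{t}_{11}(x, u)$ reduces to $t(u)$ and $\hat{q}_{11,p}(x)$ to the constant $q_p$. The closed form used for $E\Lambda_p(Z_1)$ is the standard half-normal absolute moment; the covariance is approximated by simulated draws of $Z_1$ and $Z_2$, and the integral over $u$ by a Riemann sum. The kernel, grid sizes, and function names are assumptions for illustration only.

```python
import math
import numpy as np

def mean_lambda_p(p):
    """E[max(Z, 0)^p] for Z ~ N(0, 1): 2^(p/2 - 1) * Gamma((p + 1) / 2) / sqrt(pi)."""
    return 2.0 ** (p / 2.0 - 1.0) * math.gamma((p + 1.0) / 2.0) / math.sqrt(math.pi)

def kernel(u):
    """Assumed kernel: 1.5 * (1 - (2u)^2) on [-1/2, 1/2], zero elsewhere."""
    u = np.asarray(u, dtype=float)
    return np.where(np.abs(u) <= 0.5, 1.5 * (1.0 - (2.0 * u) ** 2), 0.0)

def t_of_u(u, n_grid=2001):
    """t(u) = int K(x) K(x + u) dx / int K(x)^2 dx, by Riemann sums (d = 1)."""
    x = np.linspace(-0.5, 0.5, n_grid)
    return float(np.sum(kernel(x) * kernel(x + u)) / np.sum(kernel(x) * kernel(x)))

def cov_lambda_p(t, p, z1, z2):
    """Monte Carlo estimate of Cov(L_p(sqrt(1 - t^2) Z1 + t Z2), L_p(Z2)), L_p(v) = max(v, 0)^p."""
    a = np.maximum(math.sqrt(max(1.0 - t * t, 0.0)) * z1 + t * z2, 0.0) ** p
    b = np.maximum(z2, 0.0) ** p
    return float(np.mean(a * b) - np.mean(a) * np.mean(b))

def q_p(p, n_draws=100_000, n_u=81, seed=0):
    """Approximate q_p = int_{-1}^{1} Cov(...) du using common random draws for every u."""
    rng = np.random.default_rng(seed)
    z1 = rng.standard_normal(n_draws)
    z2 = rng.standard_normal(n_draws)
    u_grid = np.linspace(-1.0, 1.0, n_u)
    du = u_grid[1] - u_grid[0]
    covs = [cov_lambda_p(t_of_u(u), p, z1, z2) for u in u_grid]
    return float(np.sum(covs) * du)
```

Reusing the same draws of $Z_1$ and $Z_2$ across all values of $u$ is a design choice that keeps the integrand smooth in $u$; it is not something prescribed by the paper.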

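Let $\hat{\sigma}_n^2 \equiv \mathbf{1}' \hat{\Sigma}_n \mathbf{1}$. Our test statistic is taken to be

$$
\hat{T}_n \equiv \frac{1}{\hat{\sigma}_n} \sum_{j=1}^J \left\{ n^{p/2} h^{(p-1)d/2} \Gamma_j(\hat{g}_{jn}) - \hat{a}_{jn} \right\}.
\tag{3.5}
$$

Let $z_{1-\alpha} \equiv \Phi^{-1}(1 - \alpha)$, where $\Phi$ denotes the cumulative distribution function of $N(0, 1)$.
This paper proposes using the following test:

$$
\text{Reject } H_0 \text{ if and only if } \hat{T}_n > z_{1-\alpha}.
\tag{3.6}
$$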
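
Putting the pieces together, the following is a rough end-to-end sketch of the statistic in (3.5) and the decision rule in (3.6) for the simplest case $J = 1$, $d = 1$, with the indicator weight $w(x) = 1\{lo \le x \le hi\}$. It is not the authors' implementation: the kernel, the Riemann-sum integration, the Monte Carlo approximation of $q_p$, and all function names and default values are assumptions made for illustration.

```python
import math
from statistics import NormalDist
import numpy as np

def kernel(u):
    """Assumed nonnegative kernel supported on [-1/2, 1/2]."""
    u = np.asarray(u, dtype=float)
    return np.where(np.abs(u) <= 0.5, 1.5 * (1.0 - (2.0 * u) ** 2), 0.0)

def q_p(p, rng, n_draws=50_000, n_u=41):
    """Monte Carlo / Riemann-sum approximation of q_p for d = 1."""
    z1, z2 = rng.standard_normal(n_draws), rng.standard_normal(n_draws)
    x = np.linspace(-0.5, 0.5, 1001)
    k = kernel(x)
    b = np.maximum(z2, 0.0) ** p
    u_grid = np.linspace(-1.0, 1.0, n_u)
    total = 0.0
    for u in u_grid:
        t = float(np.sum(k * kernel(x + u)) / np.sum(k * k))
        a = np.maximum(math.sqrt(max(1.0 - t * t, 0.0)) * z1 + t * z2, 0.0) ** p
        total += np.mean(a * b) - np.mean(a) * np.mean(b)
    return float(total * (u_grid[1] - u_grid[0]))

def one_sided_lp_test(x, y, h, p=1, lo=0.1, hi=0.9, alpha=0.05, n_grid=401, seed=0):
    """Return (T_hat, reject) for J = 1, d = 1, and w(x) = 1{lo <= x <= hi}."""
    n = len(x)
    grid = np.linspace(lo, hi, n_grid)
    dx = grid[1] - grid[0]
    k = kernel((grid[:, None] - x[None, :]) / h)
    g_hat = (k * y[None, :]).sum(axis=1) / (n * h)                  # kernel estimate of m(x) f(x)
    rho2_hat = (k ** 2 * y[None, :] ** 2).sum(axis=1) / (n * h)     # rho_hat^2 as in (3.4)
    gamma = np.sum(np.maximum(g_hat, 0.0) ** p) * dx                # one-sided L_p functional
    e_lambda = 2.0 ** (p / 2.0 - 1.0) * math.gamma((p + 1.0) / 2.0) / math.sqrt(math.pi)
    a_hat = h ** (-0.5) * np.sum(rho2_hat ** (p / 2.0)) * dx * e_lambda
    sigma2_hat = q_p(p, np.random.default_rng(seed)) * np.sum(rho2_hat ** p) * dx
    stat = (n ** (p / 2.0) * h ** ((p - 1.0) / 2.0) * gamma - a_hat) / math.sqrt(sigma2_hat)
    crit = NormalDist().inv_cdf(1.0 - alpha)                        # standard normal critical value
    return float(stat), bool(stat > crit)
```

The weight support $[lo, hi]$ should sit far enough inside the range of $X$ that the kernel window of half-width $h/2$ stays within the data, in the spirit of Assumption 1(iii).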

The following theorem shows that the test has an asymptotically valid size.

Theorem 2: Suppose that Assumptions 1-3 hold and that $h \to 0$ and $n^{-1/2} h^{-3d/2} \to 0$,
as $n \to \infty$. Furthermore, assume that the kernel function $K$ in Assumption 2 is
nonnegative. Then under the null hypothesis, we have

$$
\lim_{n \to \infty} P\{\hat{T}_n > z_{1-\alpha}\} \le \alpha,
$$

with equality holding if $m_j(x) = 0$ for almost all $x \in \mathcal{X}$ and for all $j \in \mathcal{J}$.



Note that the probability of making an error of rejecting the true null hypothesis is
largest when $m_j(x) = 0$ for almost all $x \in \mathcal{X}$ and for all $j \in \mathcal{J}$, namely, when we are in
the least favorable case of the null hypothesis.

The nonparametric test does not require assumptions for $m_j$'s and $f$ beyond those in
Assumption 1(i), even after replacing $a_{jn}$'s and $\sigma_n^2$ by their estimators. In particular, the
theory does not require continuity or differentiability of $f$ or $m_j$'s. This is because we
do not need to control the bias to implement the test. This result uses the assumption
that the kernel function $K$ is nonnegative to control the size of the test. (See the proof
of Theorem 2 for details.)



TESTING FUNCTIONAL INEQUALITIES 11 

The bandwidth condition for Theorem 2 is stronger than that in Theorem 1. This is
mainly due to the treatment of the estimation errors in $\hat{a}_{jn}$ and $\hat{\sigma}_n^2$. For the bandwidth
parameter, it suffices to take $h = c_1 n^{-s}$ with $0 < s < 1/(3d)$ for a constant $c_1 > 0$. In
general, optimal bandwidth choice for nonparametric testing is different from that for
nonparametric estimation as we need to balance the size and power of the test instead of
the bias and variance of an estimator. For example, Gao and Gijbels (2008) considered
testing a parametric null hypothesis against a nonparametric alternative and derived
a bandwidth-selection rule by utilizing an Edgeworth expansion of the asymptotic
distribution of the test statistic concerned. The methods of Gao and Gijbels (2008) are
not directly applicable to our tests, and it is a challenging problem to develop a theory
of optimal bandwidths for our tests. We provide some simulation evidence regarding
sensitivity to the choice of $h$ in Section 5.
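
As a quick check of the rate conditions, and assuming the power-law form $h = c_1 n^{-s}$ mentioned above, the exponents can be worked out directly. The display below is a worked verification, not an additional result.

```latex
% Bandwidth h = c_1 n^{-s} with c_1 > 0.
\begin{align*}
  h \to 0
    &\iff s > 0, \\
  n^{-1/2} h^{-3d/2} = c_1^{-3d/2}\, n^{(3ds - 1)/2} \to 0
    &\iff s < \frac{1}{3d}
    && \text{(Theorems 2 and 3)}, \\
  n^{-1/2} h^{-d} = c_1^{-d}\, n^{(2ds - 1)/2} \to 0
    &\iff s < \frac{1}{2d}
    && \text{(Theorem 1)}.
\end{align*}
```

For instance, with $d = 1$ any $s \in (0, 1/3)$ satisfies the condition of Theorem 2, and hence also the weaker condition of Theorem 1.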

According to Theorems 1-2, each choice of the weight functions $w_j$ leads to an
asymptotically valid test. The actual choice of $w_j$ may reflect the relative importance of
individual inequality restrictions. When it is of little practical significance to treat
individual inequality restrictions differently, one may simply choose $w_j(x) = 1\{x \in \mathcal{S}\}$ with
some common support $\mathcal{S}$. Perhaps more naturally, to avoid undue influences of different
scales across $Y_{ji}$'s, one may use $w_j(x) = \hat{\sigma}_{jj,n}^{-1/2} w(x)$, for some common nonnegative
weight function $w(x)$, where

$$
\hat{\sigma}_{jj,n} \equiv q_p \int_{\mathcal{X}} \hat{\rho}_{jn}^{2p}(x) w^2(x) dx, \quad j \in \mathcal{J},
$$

and $\hat{\rho}_{jn}^2(x)$ is given as in (3.4). Then $\hat{\sigma}_{jj,n}$ is consistent for its population counterpart (see the proof of
Theorem 2), and just as the estimation error of $\hat{\sigma}_n$ in (3.6) leaves the limiting distribution
of $\hat{T}_n$ under the null hypothesis intact, so does the estimation error of $\hat{\sigma}_{jj,n}$.

The following result shows the consistency of the test in (3.6) against fixed alternatives.

Theorem 3: Suppose that Assumptions 1-3 hold and that $h \to 0$ and $n^{-1/2} h^{-3d/2} \to 0$,
as $n \to \infty$. If $H_1$ is true and $\Gamma_j(g_j) > 0$ for some $j \in \mathcal{J}$, then we have

$$
\lim_{n \to \infty} P\{\hat{T}_n > z_{1-\alpha}\} = 1.
$$

3.3. Local Asymptotic Power. We determine the power of the test in (3.6) against
some sequences of local alternatives. Consider the following sequences of local
alternatives converging to the null hypothesis at the rate $n^{-1/2}$:

$$
H_\delta : g_j(x) = n^{-1/2} \delta_j(x), \text{ for each } j \in \mathcal{J},
\tag{3.7}
$$

where $\delta_j(\cdot)$'s are bounded real functions on $\mathbb{R}^d$.

The following theorem establishes a representation of the local asymptotic power
functions, when $p \in \{1, 2\}$. For simplicity of notation, let us introduce the following
definition: for $s \in \{1, 2\}$, $z \in \{-1, 0, 1\}$, a given weight function vector $w \equiv (w_1, \ldots, w_J)$,
and the direction $\delta = (\delta_1, \ldots, \delta_J)'$, let
$\eta_{s,z}(w, \delta) \equiv \sum_{j=1}^J \int_{\mathcal{X}} \delta_j^s(x) \rho_j^z(x) w_j(x) dx$, and let
$\sigma^2 \equiv \mathbf{1}' \Sigma \mathbf{1}$.

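Theorem 4: Suppose that Assumptions 1-3 hold and that $h \to 0$ and $n^{-1/2} h^{-3d/2} \to 0$,
as $n \to \infty$.

(i) If $p = 1$, then, under $H_\delta$, we have

$$
\lim_{n \to \infty} P\{\hat{T}_n > z_{1-\alpha}\} = 1 - \Phi\left( z_{1-\alpha} - \eta_{1,0}(w, \delta) / (2\sigma) \right).
$$

(ii) If $p = 2$, then, under $H_\delta$, we have

$$
\lim_{n \to \infty} P\{\hat{T}_n > z_{1-\alpha}\} = 1 - \Phi\left( z_{1-\alpha} - \eta_{1,1}(w, \delta) / (\sigma \sqrt{\pi/2}) \right).
$$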



Theorem 4 gives explicit local asymptotic power functions under $H_\delta$, when $p = 1$
and $p = 2$. The local power of the test is greater than the size $\alpha$, whenever the
"noncentrality parameter" ($\eta_{1,0}(w, \delta) / (2\sigma)$ in the case of $p = 1$ and $\eta_{1,1}(w, \delta) / (\sigma \sqrt{\pi/2})$ in the
case of $p = 2$) is strictly positive. For example, when $J = 1$ and $p = 1$ (or $p = 2$), the
test is asymptotically locally strictly unbiased as long as $\int_{\mathcal{X}} \delta_1(x) w_1(x) dx > 0$ (or
$\int_{\mathcal{X}} \delta_1(x) \rho_1(x) w_1(x) dx > 0$). Notice that this integral can be strictly positive even if $\delta_1(x)$ takes
negative values for some $x \in \mathcal{X}$. Therefore, our test has nontrivial local power against
some, though not all, $n^{-1/2}$-local alternatives.
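
The local power formulas of Theorem 4 are simple to evaluate once $\eta_{1,0}$ or $\eta_{1,1}$ and $\sigma$ are given. The helper below is a minimal sketch of that evaluation; the function name and argument layout are hypothetical, and the inputs are treated as known numbers rather than estimated from data.

```python
import math
from statistics import NormalDist

def local_power(eta, sigma, p, alpha=0.05):
    """Asymptotic local power under H_delta for p = 1 (eta = eta_{1,0}) or p = 2 (eta = eta_{1,1})."""
    nd = NormalDist()
    z = nd.inv_cdf(1.0 - alpha)
    if p == 1:
        shift = eta / (2.0 * sigma)
    elif p == 2:
        shift = eta / (sigma * math.sqrt(math.pi / 2.0))
    else:
        raise ValueError("Theorem 4 covers only p = 1 and p = 2")
    return 1.0 - nd.cdf(z - shift)
```

A zero noncentrality parameter returns the nominal size $\alpha$, a positive one returns power above $\alpha$, and a negative one returns power below $\alpha$, which corresponds to the locally biased case.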

On the other hand, if the noncentrality parameter is zero, the test still has nontrivial
power against local alternatives converging to the null at the $n^{-1/2} h^{-d/4}$ rate, which is
slower than $n^{-1/2}$. To show this, consider the following local alternatives:

$$
H_\delta^* : g_j(x) = n^{-1/2} h^{-d/4} \delta_j(x), \text{ for each } j \in \mathcal{J},
$$

where $\delta_j(\cdot)$'s are bounded real functions as before. Theorem 4* gives the local asymptotic
power functions against $H_\delta^*$.

Theorem 4*: Suppose that Assumptions 1-3 hold and that $h \to 0$ and $n^{-1/2} h^{-3d/2} \to 0$,
as $n \to \infty$.

(i) If $p = 1$ and $\eta_{1,0}(w, \delta) = 0$, then, under $H_\delta^*$, we have

$$
\lim_{n \to \infty} P\{\hat{T}_n > z_{1-\alpha}\} = 1 - \Phi\left( z_{1-\alpha} - \eta_{2,-1}(w, \delta) / (\sqrt{8\pi} \sigma) \right).
$$

(ii) If $p = 2$ and $\eta_{1,1}(w, \delta) = 0$, then, under $H_\delta^*$, we have

$$
\lim_{n \to \infty} P\{\hat{T}_n > z_{1-\alpha}\} = 1 - \Phi\left( z_{1-\alpha} - \eta_{2,0}(w, \delta) / (2\sigma) \right).
$$



If $\eta_{1,0}(w, \delta) = 0$ in the case of $p = 1$ or $\eta_{1,1}(w, \delta) = 0$ in the case of $p = 2$, then the local
power of the test is greater than the size $\alpha$ because the new noncentrality parameter
in Theorem 4* is strictly positive. For example, when $J = 1$, we have $\eta_{2,-1}(w, \delta) =
\int_{\mathcal{X}} \delta_1^2(x) \rho_1^{-1}(x) w_1(x) dx > 0$ (and $\eta_{2,0}(w, \delta) = \int_{\mathcal{X}} \delta_1^2(x) w_1(x) dx > 0$) for all $\delta_1$. Therefore,
when $\eta_{1,0}(w, \delta) = 0$ or $\eta_{1,1}(w, \delta) = 0$, Theorem 4* implies that our test is strictly locally
unbiased against the $n^{-1/2} h^{-d/4}$ local alternatives $H_\delta^*$, though it has only trivial local
power ($= \alpha$) against the $n^{-1/2}$ local alternatives $H_\delta$.

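To explain the results of Theorems 4 and 4* more intuitively, consider the test statistic
$\hat{T}_n$ with $J = 1$, $p = 2$ and $d = 1$. For simplicity, take $w(\cdot) = 1$. Let $\sigma^2 \equiv q_2 \int_{\mathcal{X}} \rho_1^4(x) dx$ and
$a_n \equiv h^{-1/2} \int_{\mathcal{X}} \rho_{1n}^2(x) dx \cdot E\Lambda_2(Z_1)$. Let the alternative hypothesis be given by

$$
H_\delta^* : g(x) = n^{-1/2} h^{-b} \delta_1(x),
$$

where $b = 0$ or $1/4$. Consider the statistic $\hat{T}_n$ with $\hat{\sigma}_n$ and $\hat{a}_n$ replaced by their population
analogues $\sigma_n$ and $a_n$, respectively, i.e.,

$$
\begin{aligned}
T_n &= \frac{1}{\sigma_n} \left\{ n h^{1/2} \int_{\mathcal{X}} \Lambda_2(\hat{g}_n(x)) dx - a_n \right\} \\
&= \frac{n h^{1/2}}{\sigma_n} \left\{ \int_{\mathcal{X}} \Lambda_2(\hat{g}_n(x)) dx - \int_{\mathcal{X}} E\Lambda_2(\hat{g}_n(x)) dx \right\}
 + \frac{1}{\sigma_n} \left\{ n h^{1/2} \int_{\mathcal{X}} E\Lambda_2(\hat{g}_n(x)) dx - a_n \right\}.
\end{aligned}
\tag{3.8}
$$

It is easy to see that $T_n$ has the same asymptotic distribution as $\hat{T}_n$ under the local
alternative hypothesis. The first term on the right hand side of (3.8) converges in
distribution to the standard normal distribution by arguments similar to those used
to prove Theorem 1. Consider the second term in (3.8). We can approximate it by

$$
\begin{aligned}
& \frac{1}{\sigma_n} \left\{ n h^{1/2} \int_{\mathcal{X}} E\Lambda_2(\hat{g}_n(x)) dx - a_n \right\} \\
&= \frac{1}{\sigma_n} \left\{ \int_{\mathcal{X}} E\Lambda_2\left( h^{-1/4} \rho_1(x) \frac{\sqrt{nh} [\hat{g}_n(x) - E\hat{g}_n(x)]}{\rho_1(x)} + n^{1/2} h^{1/4} E\hat{g}_n(x) \right) dx - a_n \right\} \\
&\approx \sigma^{-1} \int_{\mathcal{X}} E\Lambda_2\left( h^{-1/4} \rho_1(x) Z_1 + n^{1/2} h^{1/4} E\hat{g}_n(x) \right) dx - \sigma^{-1} a_n && (3.9) \\
&\approx \sigma^{-1} \int_{\mathcal{X}} \left\{ E\Lambda_2\left( h^{-1/4} \rho_1(x) Z_1 + h^{1/4 - b} \delta_1(x) \right) - E\Lambda_2\left( h^{-1/4} \rho_1(x) Z_1 \right) \right\} dx && (3.10) \\
&\approx h^{-b} \left( \frac{1}{\sigma \sqrt{\pi/2}} \int_{\mathcal{X}} \delta_1(x) \rho_1(x) dx \right) + h^{1/2 - 2b} \left( \frac{1}{2\sigma} \int_{\mathcal{X}} \delta_1^2(x) dx \right), && (3.11)
\end{aligned}
$$

where (3.9) follows from the Poissonization argument, (3.10) holds by $n^{1/2} h^{1/4} E\hat{g}_n(x) =
h^{1/4 - b} \int \delta_1(x - uh) K(u) du \approx h^{1/4 - b} \delta_1(x)$, and (3.11) uses a Taylor expansion
$E\Lambda_2(\gamma Z_1 + \mu) - E\Lambda_2(\gamma Z_1) \approx 2\phi(0) \mu \gamma + \Phi(0) \mu^2$ with $\gamma = h^{-1/4} \rho_1(x)$ and $\mu = h^{1/4 - b} \delta_1(x)$, where
$\phi(\cdot)$ and $\Phi(\cdot)$, respectively, denote the pdf and cdf of the standard normal distribution.
This approximation tells us that if $\int_{\mathcal{X}} \delta_1(x) \rho_1(x) dx > 0$, we can take $b = 0$ so that the test can
achieve nontrivial power against $n^{-1/2}$ alternatives, while if $\int_{\mathcal{X}} \delta_1(x) \rho_1(x) dx = 0$, then
we should take $b = 1/4$ so that it has nontrivial local power against $n^{-1/2} h^{-1/4}$ local
alternatives. Notice that, in the latter case, $\int_{\mathcal{X}} \delta_1^2(x) dx$ is always positive.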
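
The Taylor expansion used in the last step can be checked numerically. The sketch below relies on the standard closed form $E[\max\{\gamma Z + \mu, 0\}^2] = (\gamma^2 + \mu^2) \Phi(\mu/\gamma) + \mu \gamma \phi(\mu/\gamma)$ for $Z \sim N(0, 1)$ and $\gamma > 0$, and compares the exact increment with $2\phi(0)\mu\gamma + \Phi(0)\mu^2$. The function names and the numerical values in the usage note are illustrative assumptions.

```python
from statistics import NormalDist

_ND = NormalDist()

def mean_lambda2(gamma, mu):
    """Closed form for E[max(gamma * Z + mu, 0)^2], Z ~ N(0, 1), gamma > 0."""
    r = mu / gamma
    return (gamma ** 2 + mu ** 2) * _ND.cdf(r) + mu * gamma * _ND.pdf(r)

def exact_increment(gamma, mu):
    """E L_2(gamma Z + mu) - E L_2(gamma Z), where L_2(v) = max(v, 0)^2."""
    return mean_lambda2(gamma, mu) - mean_lambda2(gamma, 0.0)

def taylor_increment(gamma, mu):
    """Second-order approximation 2 * phi(0) * mu * gamma + Phi(0) * mu^2."""
    return 2.0 * _ND.pdf(0.0) * mu * gamma + _ND.cdf(0.0) * mu ** 2
```

In the derivation $\gamma = h^{-1/4} \rho_1(x)$ grows and $\mu = h^{1/4 - b} \delta_1(x)$ is small relative to $\gamma$ as $h \to 0$; trying, say, `gamma = 10.0` and `mu = 0.1` shows the two increments agreeing closely, with a discrepancy of order $\mu^3 / \gamma$.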

It would also be interesting to compare local power properties of our test with that of
Andrews and Shi (2011a). Unlike our test, the test of Andrews and Shi (2011a, Theorem
4(b)) does not require $\int_{\mathcal{X}} \delta_1(x) \rho_1(x) dx > 0$, but excludes some $n^{-1/2}$-local alternatives.
An analytical and unambiguous comparison between the two approaches is not
straightforward, because the test of Andrews and Shi (2011a) is not asymptotically distribution
free, meaning that the local power function may depend on the underlying data
generating process in a complicated way. However, we do compare the two approaches in our
simulation studies.

When $J = 1$, thanks to Theorem 4, we can compute an optimal weight function that
maximizes the local power against a given direction $\delta$. See Stute (1997) for related
results of optimal directional tests, and Tripathi and Kitamura (1997) for results of
optimal directional and average tests based on smoothed empirical likelihoods.

Define $\sigma_p^2(w_1) = q_p \int_{\mathcal{X}} \rho_1^{2p}(x)w_1^2(x)\,dx$ for $J = 1$. The optimal weight function (denoted by $w_1^*$) is taken to be a maximizer of the drift term $\eta_{1,0}(w_1, \delta_1)/\sigma_1(w_1)$ (in the case of $p = 1$) or $\eta_{1,1}(w_1, \delta_1)/\sigma_2(w_1)$ (in the case of $p = 2$) with respect to $w_1$ under the constraint that $w_1 \ge 0$ and $\int_{\mathcal{X}} w_1^2(x)\rho_1^{2p}(x)\,dx = 1$. The latter condition is for a scale normalization. Let $\delta_1^+ = \max\{\delta_1, 0\}$. Since $\rho_1$ and $w_1$ are nonnegative, the Cauchy-Schwarz inequality suggests that the optimal weight function is given by



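\[
w_1^*(x) = \frac{\delta_1^+(x)\rho_1^{-2}(x)}{\sqrt{\int_{\mathcal{X}} (\delta_1^+)^2(x)\rho_1^{-2}(x)\,dx}}
\]

if $p = 1$, and

\[
w_1^*(x) = \frac{\delta_1^+(x)\rho_1^{-3}(x)}{\sqrt{\int_{\mathcal{X}} (\delta_1^+)^2(x)\rho_1^{-2}(x)\,dx}} \tag{3.12}
\]

if $p = 2$.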

To satisfy Assumption 1(iii), we assume that the support of $\delta_1$ is contained in an $\varepsilon$-shrunk subset of $\mathcal{X}$. With this choice of an optimal weight function, the local power function becomes:



\[
1 - \Phi\left( z_{1-\alpha} - \sqrt{\frac{\int_{\mathcal{X}} (\delta_1^+)^2(x)\rho_1^{-2}(x)\,dx}{q_2\pi/2}} \right).
\]



4. Comparison with Testing Functional Equalities 

It is straightforward to follow the proofs of Theorems 1-3 to develop a test for equality 
restrictions: 

(4.1) $H_0 : m_j(x) = 0$ for all $(x, j) \in \mathcal{X} \times \mathcal{J}$, vs.

$H_1 : m_j(x) \neq 0$ for some $(x, j) \in \mathcal{X} \times \mathcal{J}$.

For this test, we redefine $\Lambda_p(v) = |v|^p$ and, using this, redefine $T_n$ in (3.5) and $\sigma^2$. Then under the null hypothesis,

\[
\hat T_n \to_d N(0, 1).
\]

Therefore, we can take a critical value in the same way as before. The asymptotic validity of this test under the null hypothesis in (4.1) follows under precisely the same conditions as in Theorem 2. However, the convergence rates of the inequality tests and the equality tests under local alternatives are different, as we shall see now.

Consider the local alternatives converging to the null hypothesis at the rate $n^{-1/2}h^{-d/4}$:

(4.2) $H_\delta : g_j(x) = n^{-1/2}h^{-d/4}\delta_j(x)$, for each $j \in \mathcal{J}$,

where the $\delta_j(\cdot)$'s are again bounded real functions on $\mathbf{R}^d$. The following theorem establishes the local asymptotic power functions of the test based on $T_n$.



Theorem 5: Suppose that Assumptions 1-3 hold and that $h \to 0$ and $n^{-1/2}h^{-3d/2} \to 0$, as $n \to \infty$.
(i) If $p = 1$, then under $H_\delta$, we have

\[
\lim_{n \to \infty} P\{\hat T_n > z_{1-\alpha}\} = 1 - \Phi\left( z_{1-\alpha} - \eta_{2,-1}(w, \delta)/(\sqrt{2\pi}\,\sigma) \right).
\]

(ii) If $p = 2$, then under $H_\delta$, we have

\[
\lim_{n \to \infty} P\{\hat T_n > z_{1-\alpha}\} = 1 - \Phi\left( z_{1-\alpha} - \eta_{2,0}(w, \delta)/\sigma \right).
\]



Theorem 5 shows that the equality tests (on (4.1)), in contrast to the inequality tests (on (1.1)), have nontrivial local power against alternatives converging to the null at rate $n^{-1/2}h^{-d/4}$, which is slower than $n^{-1/2}$. This phenomenon of different convergence rates arises because $\Lambda_p$ is symmetric around zero in the case of equality tests, and it is not in the case of inequality tests. To see this closely, observe that in the case of $p = 1$, the power comparison between the equality test and the inequality test is reduced to comparison between $\mathbf{E}|Z_1 + \mu| - \mathbf{E}|Z_1|$ and $\mathbf{E}\max\{Z_1 + \mu, 0\} - \mathbf{E}\max\{Z_1, 0\}$ for $\mu$ close to zero, where $Z_1$ follows a standard normal distribution with $\phi$ denoting its density. Note that we can approximate $\mathbf{E}|Z_1 + \mu| - \mathbf{E}|Z_1|$ by $\{\phi''(0) + 2\phi(0)\}\mu^2$ for $\mu$ close to zero, and approximate $\mathbf{E}\max\{Z_1 + \mu, 0\} - \mathbf{E}\max\{Z_1, 0\}$ by $\Phi(0)\mu$ for $\mu$ close to zero. The smaller scale $\mu^2$ in the former case arises because the leading term in the expansion of $\mathbf{E}|Z_1 + \mu| - \mathbf{E}|Z_1|$ around $\mu = 0$ disappears due to the symmetry of the absolute value function $|\cdot|$. Therefore, the different rate of convergence arises due to our symmetric treatment of the alternative hypotheses (positive or negative) in the equality test, in contrast to the asymmetric treatment in the inequality test.
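
The contrast between the two expansions can be seen in a few lines of Python. The sketch below is illustrative and uses the standard closed forms $\mathbf{E}|Z_1 + \mu| = \mu(2\Phi(\mu) - 1) + 2\phi(\mu)$ and $\mathbf{E}\max\{Z_1 + \mu, 0\} = \mu\Phi(\mu) + \phi(\mu)$; since $\phi''(0) = -\phi(0)$, the two-sided gain is approximately $\phi(0)\mu^2$. The function names are hypothetical.

```python
import math

def norm_pdf(x):
    """Standard normal density phi(x)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def norm_cdf(x):
    """Standard normal cdf Phi(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def e_abs_shift(mu):
    """E|Z + mu| for Z ~ N(0,1)."""
    return mu * (2.0 * norm_cdf(mu) - 1.0) + 2.0 * norm_pdf(mu)

def e_pos_shift(mu):
    """E max(Z + mu, 0) for Z ~ N(0,1)."""
    return mu * norm_cdf(mu) + norm_pdf(mu)

def gains(mu):
    """Return (two-sided gain, one-sided gain) from shifting the mean by mu."""
    two_sided = e_abs_shift(mu) - e_abs_shift(0.0)
    one_sided = e_pos_shift(mu) - e_pos_shift(0.0)
    return two_sided, one_sided

if __name__ == "__main__":
    for mu in (0.1, 0.01):
        print(mu, gains(mu), norm_pdf(0.0) * mu ** 2, norm_cdf(0.0) * mu)
```

For small $\mu$ the one-sided gain is of order $\mu$ while the two-sided gain is of order $\mu^2$, which is the source of the different local rates.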

Since $\eta_{2,-1}(w, \delta)$ and $\eta_{2,0}(w, \delta)$ are always nonnegative, the equality tests are locally asymptotically unbiased against any local alternatives. In contrast, the terms $\eta_{1,0}(w, \delta)$ and $\eta_{1,1}(w, \delta)$ in the local asymptotic power functions of the inequality tests in Theorem 4 can take negative values for some local alternatives, implying that the inequality tests might be asymptotically biased against such local alternatives. This feature is not due to the form of our proposed inequality test, but is rather a common feature in testing moment inequalities. It arises because the null hypothesis is composite and most of the powerful tests are not similar on the boundary and hence are biased against some local alternatives. In principle, one can construct a test that is asymptotically similar on the boundary, but such a test typically has poor power. See Andrews (2011) for details.

The test in Theorem 5 shares some features common in nonparametric tests that are known to detect some smooth local alternatives that have narrow peaks as the sample size increases. See e.g. Fan and Li (2000) and references therein. To see this closely, consider a sequence of non-Pitman local alternatives of type:

$H_{\delta,n} : g_j(x) = \gamma_n\delta_{j,n}(x)$, for each $j \in \mathcal{J}$,

where $\gamma_n$ is a deterministic sequence and $\delta_{j,n}(x)$ is now allowed to change over $n$. For example, one may consider $\delta_{j,n}(x)$ to be a function with a single peak that becomes sharper as $n$ becomes large, e.g. $\delta_{j,n}(x) = L_j((x - x_0)/\zeta_n)$, where $L_j(\cdot)$ is a bounded function, $x_0 \in \mathbf{R}^d$ is a fixed point, and $\zeta_n \to 0$ as $n \to \infty$. By using the same arguments as in the proof of Theorem 5, we can show that the two-sided version of our test has nontrivial power against such local alternatives provided $\lim_{n \to \infty} nh^{d/2}\gamma_n^2\eta_{2,-1}(w, \delta_{j,n}) \neq 0$ (for $p = 1$) or $\lim_{n \to \infty} nh^{d/2}\gamma_n^2\eta_{2,0}(w, \delta_{j,n}) \neq 0$ (for $p = 2$). However, since our main interest lies in testing functional inequalities, we will not pursue further local power properties of the equality test. On the other hand, it would also be interesting to see whether taking the supremum of the two-sided version of our test over a set of bandwidths would give an adaptive, rate-optimal test, as in Horowitz and Spokoiny (2001). However, the latter study is beyond the scope of this paper.

As in Section 3.3, when $J = 1$, an optimal directional test under (4.2) can also be obtained by following the arguments leading up to (3.12) so that



Similarly as before, let the support of $\delta_1$ be contained in an $\varepsilon$-shrunk subset of $\mathcal{X}$. The optimal weight function yields the following local power functions:



where $q_p = \int_{[-1,1]^d} \mathrm{Cov}\left( |\sqrt{1 - t^2(u)}\,Z_1 + t(u)Z_2|^p, |Z_2|^p \right) du$, for $p \in \{1, 2\}$.

5. Monte Carlo Experiments

This section reports the finite-sample performance of the one-sided $L_1$- and $L_2$-type tests from a Monte Carlo study. In the experiments, $n$ observations of a pair of random variables $(Y, X)$ were generated from $Y = m(X) + \sigma(X)U$, where $X \sim \mathrm{Unif}[0, 1]$ and $U \sim N(0, 1)$ and $X$ and $U$ are independent. In all the experiments, we set $\mathcal{X} = [0.05, 0.95]$.

To evaluate the finite-sample size of the tests, we first set $m(x) = 0$. We call this case DGP0. In addition, we consider the following alternative model

(5.1) $m(x) = x(1 - x) - c_m$,

where $c_m \in \{0.25, 0.20, 0.15, 0.10, 0.05\}$. We call these 5 cases DGPs 1-5. When $c_m = 0.25$ (DGP1), we have $m(x) < 0$ for all $x \neq 0.5$ and $m(x) = 0$ with $x = 0.5$. Hence, this case corresponds to the "interior" of the null hypothesis. In view of asymptotic theory, we expect the empirical probability of rejecting $H_0$ to converge to zero as $n$ gets large. When $c_m < 0.25$ (DGPs 2-5), we have $m(x) > 0$ for some $x$. Therefore, these four cases are considered to see the finite-sample power of our tests. Two different functions of $\sigma(x)$ are considered: $\sigma(x) = 1$ (homoskedastic error) and $\sigma(x) = x$ (heteroskedastic error).

The experiments use sample sizes of $n = 50, 200, 1000$ and the nominal level of $\alpha = 0.05$. We performed 1000 Monte Carlo replications in each experiment. In implementing both the $L_1$- and $L_2$-type tests, we used $K(u) = (3/2)(1 - (2u)^2)I(|u| \le 1/2)$ and $h = C_h \times s_X \times n^{-1/5}$, where $I(A)$ is the usual indicator function that has value one if $A$ is true and zero otherwise, $C_h$ is a constant and $s_X$ is the sample standard deviation of $X$. To check the sensitivity to the choice of the bandwidth, eight different values of $C_h$ are considered: $\{0.75, 1, 1.25, 1.5, 1.75, 2, 2.25, 2.50\}$. Finally, we considered the uniform weight function: $w(x) = 1$ and the inverse standard error weight function: $w(x) = 1/\hat\rho_n(x)$.
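
To make the design concrete, the following Python sketch generates one sample from the model above and evaluates an uncentered, unscaled version of the one-sided functional with the uniform weight. It is a sketch under stated assumptions and not the authors' code: the function names are hypothetical, the integral over $[0.05, 0.95]$ is replaced by a midpoint rule, and the centering and scaling terms of the actual test statistic are omitted.

```python
import math
import random
import statistics

def kernel(u):
    """K(u) = (3/2)(1 - (2u)^2) for |u| <= 1/2 and 0 otherwise."""
    return 1.5 * (1.0 - (2.0 * u) ** 2) if abs(u) <= 0.5 else 0.0

def simulate(n, m, sigma, rng):
    """Draw n pairs from Y = m(X) + sigma(X) U, X ~ Unif[0,1], U ~ N(0,1)."""
    xs = [rng.random() for _ in range(n)]
    ys = [m(x) + sigma(x) * rng.gauss(0.0, 1.0) for x in xs]
    return ys, xs

def bandwidth(xs, c_h):
    """h = C_h * s_X * n^(-1/5), with s_X the sample standard deviation."""
    return c_h * statistics.stdev(xs) * len(xs) ** (-0.2)

def g_hat(x, ys, xs, h):
    """Kernel estimate (1/(n h)) * sum_i Y_i K((x - X_i)/h), with d = 1."""
    return sum(y * kernel((x - xi) / h) for y, xi in zip(ys, xs)) / (len(xs) * h)

def raw_one_sided_stat(ys, xs, h, p, lo=0.05, hi=0.95, steps=90):
    """n^(p/2) h^((p-1)/2) times a midpoint-rule integral of max(g_hat, 0)^p.

    Uniform weight w = 1; centering and scaling are deliberately left out.
    """
    n = len(xs)
    dx = (hi - lo) / steps
    total = sum(max(g_hat(lo + (i + 0.5) * dx, ys, xs, h), 0.0) ** p
                for i in range(steps)) * dx
    return n ** (p / 2) * h ** ((p - 1) / 2) * total

if __name__ == "__main__":
    rng = random.Random(0)
    # DGP5 (c_m = 0.05) with homoskedastic errors.
    ys, xs = simulate(200, lambda x: x * (1 - x) - 0.05, lambda x: 1.0, rng)
    print(raw_one_sided_stat(ys, xs, bandwidth(xs, 1.0), p=1))
```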

To evaluate the relative performance of our test, we have also implemented one of the test statistics proposed by Andrews and Shi (2011a), specifically their Cramér-von Mises-type (CvM) statistic with both plug-in asymptotic (PA/Asy) and asymptotic generalized moment selection (GMS/Asy) critical values. Countable hypercubes are used as instrument functions, and tuning parameters were chosen following the suggestions in Section 9 of Andrews and Shi (2011a).

Empirical rejection probabilities are plotted in Figures HE 8 different solid lines 
in each panel correspond to our test with 8 different bandwidth values. 2 dotted lines 
correspond to the test of Andrews and Shi (2011a) with PA and GMS critical values. For 
each case, the test with the GMS critical value gives slightly higher rejection probabilities 
than that with the PA critical value. When $H_0$ is true and $m(x) = 0$ (DGP0), the differences between the nominal and empirical rejection probabilities are small. When $H_0$ is true and $m(x)$ is (5.1) with $c_m = 0.25$ (the interior case DGP1), the empirical rejection probabilities are smaller than the nominal level and become almost zero for $n = 1000$.

When $H_0$ is false and the correct model is (5.1) with $c_m < 0.25$ (DGPs 2-5), the power of both the $L_1$ and $L_2$ tests is increasing as $c_m$ gets smaller. This finding is consistent with asymptotic theory since it is likely that our test will be more powerful when $\int_{\mathcal{X}} m(x)w(x)\,dx$ is larger. Note that in DGPs 3-5 ($c_m = 0.15, 0.10, 0.05$), the rejection probabilities increase as $n$ gets large. This is in line with the asymptotic theory in the preceding sections, for our test is consistent for these values of $c_m$. However, the rejection probabilities are quite small even with $n = 1000$ for $c_m = 0.20$ (DGP 2). This is not surprising given that our test can be biased, as shown in Section 3.3. To further investigate the issue of bias associated with $\int_{\mathcal{X}} m(x)w(x)\,dx$, we carried out an additional simulation with $m(x) = \sin(2\pi x)$. It turns out that rejection probabilities were almost one across different values of the bandwidth for both weight functions and for both homoskedastic and heteroskedastic errors. This seems to be consistent with Theorem 4 in Section 3.3. We do not report full details of additional simulation results for brevity.

Simulation results for the CvM statistics are similar to our test statistics. More 
precisely, in Figured] (the homoskedasticity case), the L\ test with both weight functions 
seems to be more powerful than Andrews and Shi's test, whereas in Figure SJ their test 
appears to be more powerful than the $L_2$ test with the uniform weight. However, in
most cases, the power performance of the tests is comparable. Note further that
there is little difference between PA and GMS critical values for the CvM statistic of 
Andrews and Shi (2011a). This is due to the fact that m(x) is either flat or has a 
maximum at a single point. We note that the results are not very sensitive to the 
bandwidth choice for our tests. Finally, regarding the choice of the weight function, we 
would like to recommend the inverse standard error weight since it seems to perform 
better than the uniform weight in simulations. 

6. Proofs 

This section begins with a roadmap for the proof, where the roles of technical lemmas 
and main difficulties are explained. Then we state the lemmas and present the proofs 
of the theorems. 

6.1. The Roadmap for the Proof of Theorem 1. The proof of Theorem 1 follows 
the structure of the proof of the finite-dimensional convergence in Theorem 1.1 of GMZ. 

Under the condition of Theorem 1 that $m_j(x) = 0$ for almost all $x \in \mathcal{X}$ and for all $j \in \mathcal{J}$, we can show that $\mathbf{E}\hat g_{jn}(x) = 0$ for almost all $x$ in the support of $w_j$ from some large $n$ on. This means that by letting $v_{jn}(x) = \hat g_{jn}(x) - \mathbf{E}\hat g_{jn}(x)$ and $\zeta_n(A) = \sum_{j=1}^{J} \int_A \Lambda_p(v_{jn}(x))w_j(x)\,dx$ with some $A \subset \mathcal{X}$, we can write $T_n$ as

\[
\frac{n^{p/2}h^{(p-1)d/2}}{\sigma_n}\{\zeta_n(\mathcal{X}\backslash A) - \mathbf{E}\zeta_n(\mathcal{X}\backslash A)\} + \frac{n^{p/2}h^{(p-1)d/2}}{\sigma_n}\{\zeta_n(A) - \mathbf{E}\zeta_n(A)\}. \tag{6.1}
\]

The main part of the proof of Theorem 1 establishes asymptotic normality for the second term and asymptotic negligibility for the first term when $A$ is chosen to nearly cover $\mathcal{X}$. The proof of asymptotic normality employs the Poissonization method of GMZ, which prevents us from choosing $A$ to cover $\mathcal{X}$ entirely. This makes the proof intricate. The asymptotic arguments for both terms of (6.1) require that $\sigma_n$ is an asymptotically stable quantity. Hence we begin by dealing with $\sigma_n$.



Step 1: In Lemma A7, we show that given appropriate $A \subset \mathcal{X}$, $\sigma_n(A) \to \sigma(A)$ as $n \to \infty$, for some $\sigma(A) > 0$, where $\sigma_n(A)$ is $\sigma_n$ except that the integral domains of $\sigma_{jk,n}$ are restricted to $A$. To prove the convergence, we choose the domain $A$ to be such that the nonparametric functions $\rho_{jk,n}$ that constitute $\sigma_n$ are continuous and uniformly convergent on this domain. That we can choose such $A$ to be large enough is ensured by Lemma A1. The proof is lengthy, the main step being the approximation of covariances of Poissonized sums. For this approximation, we use a type of Berry-Esseen bound for sums of independent random variables due to Sweeting (1977). This bound is restated in Lemma A2. Since the bound involves various moments of random quantities, we prepare these moment bounds in Lemmas A4 and A5.



Step 2: We establish that the second term in (6.1) is asymptotically standard normal when $A$ nearly covers $\mathcal{X}$. First, we use Lemma A6 to show that the second component in (6.1) is asymptotically equivalent to

\[
\frac{n^{p/2}h^{(p-1)d/2}}{\sigma_n(A)}\{\zeta_n(A) - \mathbf{E}\zeta_N(A)\}, \tag{6.2}
\]

where $\zeta_N(A) = \sum_{j=1}^{J} \int_A \Lambda_p(v_{jN}(x))w_j(x)\,dx$, $v_{jN}(x) = \hat g_{jN}(x) - \mathbf{E}\hat g_{jn}(x)$,

\[
\hat g_{jN}(x) = \frac{1}{nh^d}\sum_{i=1}^{N} Y_{ji}K\left(\frac{x - X_i}{h}\right), \tag{6.3}
\]

and $N$ is a Poisson random variable with mean $n$ and independent of all the other random variables. Then consider

\[
S_n(A) = \frac{n^{p/2}h^{(p-1)d/2}\{\zeta_N(A) - \mathbf{E}\zeta_N(A)\}}{\sigma_n(A)}, \tag{6.4}
\]

where $\sigma_n^2(A) = \sum_{j=1}^{J}\sum_{k=1}^{J} \sigma_{jk,n}(A)$ and $\sigma_{jk,n}(A)$ is $\sigma_{jk,n}$ with the integral domain restricted to $A$. Note that the numerator of $S_n(A)$ is based on the Poissonized version $v_{jN}(x)$ so that when we cut the integral in $\zeta_N(A)$ into integrals on small disjoint domains and sum them, this latter sum behaves like a sum of independent random variables. In Lemma A9, we construct this sum and apply the CLT to obtain asymptotic normality for $S_n(A)$. Then in Lemma A10, using the de-Poissonization lemma of Beirlant and Mason (1995), we deduce that the conditional distribution of $S_n(A)$ given $N = n$ converges to a standard normal distribution. (This lemma requires the set $\mathcal{X}\backslash A$ to stay nonempty.) Since this conditional distribution is nothing but the distribution of (6.2), we conclude that the second term in (6.1) is asymptotically standard normal. However, this sequence of arguments so far presumes that $\sigma_n$ is an asymptotically right scale, which means that $\sigma_n$ should be based on the Poissonized version $v_{jN}(x)$, not on the original one $v_{jn}(x)$.

Step 3: It remains to deal with the first term in (6.1). Since $\sigma_n(A)$ is close to $\sigma(A) > 0$ by Step 1 for large samples, it suffices to show that the quantity

\[
n^{p/2}h^{(p-1)d/2}\{\zeta_n(\mathcal{X}\backslash A) - \mathbf{E}\zeta_n(\mathcal{X}\backslash A)\}
\]

is asymptotically negligible for large $n$ and large set $A$. This is accomplished by Lemma A8, which again uses the moment bounds of Lemmas A4 and A5. Since $w_j$ is square integrable, if we can take $A \subset \mathcal{X}$ such that $\int_{\mathcal{X}\backslash A} w_j(x)\,dx$ is small, the asymptotic negligibility of the first component in (6.1) follows by Lemma A8. Lemma A8 extends Lemma 6.2 of GMZ from $p = 1$ to $p \ge 1$. This generalization is necessary since the majorization inequality of Pinelis (1994) used in GMZ is not directly applicable in the general case with $p > 1$.

Step 4: Finally, we approximate $\mathbf{E}\zeta_n(A)$ in the second component in (6.1) by an estimable quantity, $\sum_{j \in \mathcal{J}} a_{jn}$ in Theorem 1. This step is done through Lemma A6. The lemma is adapted from Lemma 6.3 of GMZ, but unlike their case of the $L_1$-norm, our case involves the one-sided $L_p$-norm with $p \ge 1$. For this modification, we use the algebraic inequality of Lemma A3. This closes the proof of Theorem 1.
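
The role of Poissonization in Step 2 can be illustrated by a small simulation. With a fixed sample size $n$, the numbers of observations falling in two disjoint regions are negatively correlated, whereas with a Poisson sample size $N$ of mean $n$ they are independent; this is what makes sums over disjoint domains behave like sums of independent variables. The Python sketch below is illustrative only: the bins $[0, 0.3)$ and $[0.3, 0.6)$, the sampler, and all function names are assumptions made for the example.

```python
import math
import random

def poisson(mean, rng):
    """Knuth's multiplication method; adequate for moderate means."""
    limit = math.exp(-mean)
    k = 0
    prod = rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def bin_counts(m, rng):
    """Counts of m Unif[0,1] draws falling in [0, 0.3) and [0.3, 0.6)."""
    a = b = 0
    for _ in range(m):
        u = rng.random()
        if u < 0.3:
            a += 1
        elif u < 0.6:
            b += 1
    return a, b

def covariance(pairs):
    """Sample covariance of a list of (a, b) pairs."""
    n = len(pairs)
    mean_a = sum(a for a, _ in pairs) / n
    mean_b = sum(b for _, b in pairs) / n
    return sum((a - mean_a) * (b - mean_b) for a, b in pairs) / (n - 1)

def compare(n=20, reps=20000, seed=0):
    """Covariance of the two bin counts with fixed n and with Poisson(n) size."""
    rng = random.Random(seed)
    fixed = [bin_counts(n, rng) for _ in range(reps)]
    poissonized = [bin_counts(poisson(n, rng), rng) for _ in range(reps)]
    return covariance(fixed), covariance(poissonized)

if __name__ == "__main__":
    print(compare())
```

With a fixed sample size the covariance is $-n \times 0.3 \times 0.3$ in population, while the Poissonized covariance is zero.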

6.2. Technical Lemmas and the Proof of Theorem 1. We begin with technical 
lemmas. The lemmas are ordered so that lemmas that come later rely on their preceding 
lemmas. 

The first statement of the lemma below is a special case of Theorem 2(b) of Stein 
(1970) on pages 62 and 63. The second statement is an extension of Lemma 6.1 of 
GMZ. 

Lemma A1: Let $J(\cdot) : \mathbf{R}^d \to \mathbf{R}$ be a Lebesgue integrable bounded function and $H : \mathbf{R}^d \to \mathbf{R}$ be a bounded function with compact support $S$. Then, for almost every $y \in \mathbf{R}^d$,

\[
\int_{\mathbf{R}^d} J(x)H_h(y - x)\,dx \to J(y)\int_S H(x)\,dx, \text{ as } h \to 0,
\]

where $H_h(x) = H(x/h)/h^d$.

Furthermore, suppose that $\bar J \equiv \int |J(z)|\,dz > 0$. Then for all $0 < \varepsilon < \bar J$, there exist $M > 0$, $v > 0$ and a Borel set $B$ of finite Lebesgue measure $m(B)$ such that $B \subset [-M + v, M - v]^d$,

\[
\alpha \equiv \int_{\mathbf{R}^d\backslash[-M,M]^d} |J(z)|\,dz > 0, \quad \int_B |J(z)|\,dz > \bar J - \varepsilon,
\]

$J$ is continuous on $B$, and

\[
\sup_{y \in B}\left| \int_{\mathbf{R}^d} J(x)H_h(y - x)\,dx - J(y)\int_S H(x)\,dx \right| \to 0, \text{ as } h \to 0.
\]



Proof: The first statement is a special case of Theorem 2(b) of Stein (1970) on pages 62 and 63. The second statement can be proved following the proof of Lemma 6.1 of GMZ. Since $J$ is Lebesgue integrable, the integral $\int_{\mathbf{R}^d\backslash[-M,M]^d} |J(z)|\,dz$ is continuous in $M$ and converges to zero as $M \to \infty$. We can find $M > 0$ and $v > 0$ such that

\[
\int_{\mathbf{R}^d\backslash[-M,M]^d} |J(z)|\,dz = \varepsilon/8 \quad\text{and}\quad \int_{\mathbf{R}^d\backslash[-M+v,M-v]^d} |J(z)|\,dz = \varepsilon/4.
\]

The construction of the desired set $B \subset [-M + v, M - v]^d$ can be done using the arguments in the proof of Lemma 6.1 of GMZ. ■
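
The first statement of Lemma A1 can be visualized numerically in dimension one. The Python sketch below is an illustration under assumptions: it takes $J(x) = e^{-x^2}$ and $H$ equal to the kernel used in the Monte Carlo section (which integrates to one), and approximates the convolution by a midpoint rule after the change of variables $x = y - hu$. The function names are hypothetical.

```python
import math

def kernel(u):
    """H(u) = (3/2)(1 - (2u)^2) on [-1/2, 1/2]; integrates to one."""
    return 1.5 * (1.0 - (2.0 * u) ** 2) if abs(u) <= 0.5 else 0.0

def smoothed(J, H, y, h, lo=-0.5, hi=0.5, steps=2000):
    """Midpoint approximation of int J(x) H_h(y - x) dx, H_h(x) = H(x/h)/h.

    After substituting x = y - h*u the integral becomes int J(y - h*u) H(u) du
    over the support [lo, hi] of H.
    """
    du = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        u = lo + (i + 0.5) * du
        total += J(y - h * u) * H(u)
    return total * du

def J(x):
    """A bounded, Lebesgue integrable test function."""
    return math.exp(-x * x)

def errors(y=0.3, bandwidths=(0.5, 0.1, 0.02)):
    """Absolute gap between the smoothed value and J(y) * int H for each h."""
    return [abs(smoothed(J, kernel, y, h) - J(y)) for h in bandwidths]

if __name__ == "__main__":
    print(errors())
```

The gaps shrink as $h$ decreases, in line with the pointwise convergence stated in the lemma.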



The following result is a special case of Theorem 1 of Sweeting (1977) with g(x) = 
min(x, 1) (in his notation). See also Fact 6.1 of GMZ and Fact 4 of Mason (2009) for 
applications of Theorem 1 of Sweeting (1977). 



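Lemma A2 (Sweeting (1977)): Let $Z \in \mathbf{R}^k$ be a mean zero normal random vector with covariance matrix $I$ and $\{W_i\}_{i=1}^{n}$ be a set of i.i.d. random vectors in $\mathbf{R}^k$ such that $\mathbf{E}W_i = 0$, $\mathbf{E}W_iW_i' = I$, and $\mathbf{E}\|W_i\|^r < \infty$, $r \ge 3$. Then for any Borel measurable function $\varphi : \mathbf{R}^k \to \mathbf{R}$ such that

\[
\sup_{x \in \mathbf{R}^k} \frac{|\varphi(x) - \varphi(0)|}{1 + \|x\|^r\min(\|x\|, 1)} < \infty,
\]

we have

\begin{align*}
\left| \mathbf{E}\left[\varphi\left(\frac{1}{\sqrt{n}}\sum_{i=1}^{n} W_i\right)\right] - \mathbf{E}[\varphi(Z)] \right|
&\le c_1\left( \sup_{x \in \mathbf{R}^k} \frac{|\varphi(x) - \varphi(0)|}{1 + \|x\|^r\min(\|x\|, 1)} \right)\left\{ \frac{1}{\sqrt{n}}\mathbf{E}\|W_i\|^3 + \frac{1}{n^{(r-2)/2}}\mathbf{E}\|W_i\|^r \right\} \\
&\quad + c_2\,\mathbf{E}\left[ \omega_\varphi\left( Z; \frac{c_3}{\sqrt{n}}\mathbf{E}\|W_i\|^3 \right) \right],
\end{align*}

where $c_1$, $c_2$ and $c_3$ are positive constants that depend only on $k$ and $r$ and

\[
\omega_\varphi(x; \varepsilon) = \sup\{ |\varphi(x) - \varphi(y)| : y \in \mathbf{R}^k, \|x - y\| \le \varepsilon \}.
\]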
The following algebraic inequality is used frequently throughout the proofs.

Lemma A3: For any $a, b \in \mathbf{R}$, let $a_+ = \max(a, 0)$ and $b_+ = \max(b, 0)$. Furthermore, for any real $a \ge 0$, if $a = 0$, we define $\lceil a \rceil = 1$, and if $a > 0$, we define $\lceil a \rceil$ to be the smallest integer greater than or equal to $a$. Then for any $p \ge 1$,

\begin{align*}
\max\{ |a_+^p - b_+^p|, ||a|^p - |b|^p| \}
&\le 2p|a - b|\left( \sum_{k=0}^{\lceil p-1 \rceil} \frac{\lceil p-1 \rceil!}{k!}|a - b|^{\lceil p-1 \rceil - k}|b|^k \right)^{(p-1)/\lceil p-1 \rceil} \\
&\le C\sum_{k=0}^{\lceil p-1 \rceil} |a - b|^{p - \frac{(p-1)k}{\lceil p-1 \rceil}}|b|^{\frac{(p-1)k}{\lceil p-1 \rceil}},
\end{align*}

for some $C > 0$ that depends only on $p$.




Proof: First, we show the inequality for the case where $p$ is a positive integer. We prove first that $||a|^p - |b|^p|$ has the desired bound. Note that in this case of $p$ being a positive integer, the bound takes the following form:

\[
2\sum_{k=0}^{p-1} \frac{p!}{k!}|a - b|^{p-k}|b|^k.
\]

When $p = 1$, the bound is trivially obtained. Suppose now that the inequality holds for a positive integer $q$. First, note that using the mean-value theorem, convexity of the function $f(x) = |x|^q$ for $q \ge 1$, and the triangular inequality,

\begin{align*}
||a|^{q+1} - |b|^{q+1}| &\le (q+1)|a - b|\sup_{\alpha \in [0,1]}(\alpha|a| + (1-\alpha)|b|)^q \\
&\le (q+1)|a - b|\sup_{\alpha \in [0,1]}(\alpha|a|^q + (1-\alpha)|b|^q) \\
&\le (q+1)|a - b|\left( ||a|^q - |b|^q| + 2|b|^q \right).
\end{align*}

As for $||a|^q - |b|^q|$, we apply the inequality to bound the last term by

\[
(q+1)|a - b|\left[ 2\sum_{k=0}^{q-1} \frac{q!}{k!}|a - b|^{q-k}|b|^k + 2|b|^q \right] = 2\sum_{k=0}^{q} \frac{(q+1)!}{k!}|a - b|^{q+1-k}|b|^k.
\]

Therefore, by the principle of mathematical induction, the desired bound in the case of $p$ being a positive integer follows.

Certainly, we obtain the same bound for $|a_+^p - b_+^p|$ when $p = 1$. When $p > 1$, we observe that by the mean-value theorem,

\begin{align*}
|a_+^p - b_+^p| &\le p|a - b|\left( |a|^{p-1} + |b|^{p-1} \right) \\
&\le p|a - b|\left( ||a|^{p-1} - |b|^{p-1}| + 2|b|^{p-1} \right).
\end{align*}

By applying the previous inequality to $||a|^{p-1} - |b|^{p-1}|$, we obtain the desired bound for $|a_+^p - b_+^p|$ when $p$ is a positive integer.

Since the bound holds for any positive integer $p$, let us consider the case where $p$ is a real number strictly larger than 1. Again, we first show that $||a|^p - |b|^p|$ has the desired bound. Using the mean-value theorem as before and the fact that $|a + b| \le 2^{1-1/s}(|a|^s + |b|^s)^{1/s}$ for all $s \in [1, \infty)$ and all $a, b \in \mathbf{R}$, we find that for $u = \lceil p - 1 \rceil$,

\begin{align*}
||a|^p - |b|^p| &\le p|a - b|\left( |a|^{p-1} + |b|^{p-1} \right) \\
&\le p|a - b|\,2^{1-(p-1)/u}\left( |a|^u + |b|^u \right)^{(p-1)/u} \\
&\le p|a - b|\,2^{1-(p-1)/u}\left( ||a|^u - |b|^u| + 2|b|^u \right)^{(p-1)/u}.
\end{align*}

Since $u$ is a positive integer, using the previous bound, we bound the right-hand side by

\[
p|a - b|\,2^{1-(p-1)/u}\left( 2\sum_{k=0}^{u-1} \frac{u!}{k!}|a - b|^{u-k}|b|^k + 2|b|^u \right)^{(p-1)/u}.
\]

Consolidating the sum in the parentheses, we obtain the wanted bound.

As for the second inequality, observe that

\begin{align*}
2p|a - b|\left( \sum_{k=0}^{\lceil p-1 \rceil} \frac{\lceil p-1 \rceil!}{k!}|a - b|^{\lceil p-1 \rceil - k}|b|^k \right)^{(p-1)/\lceil p-1 \rceil}
&\le C\max_{k \in \{0,1,\ldots,\lceil p-1 \rceil\}} |a - b|^{p - k\{(p-1)/\lceil p-1 \rceil\}}|b|^{k\{(p-1)/\lceil p-1 \rceil\}} \\
&\le C\sum_{k=0}^{\lceil p-1 \rceil} |a - b|^{p - k\{(p-1)/\lceil p-1 \rceil\}}|b|^{k\{(p-1)/\lceil p-1 \rceil\}},
\end{align*}

for some $C > 0$ that depends only on $p$. We can obtain the same bound for $|a_+^p - b_+^p|$ by noting that $|a_+^p - b_+^p| \le p|a - b|(|a|^{p-1} + |b|^{p-1})$ and following the same arguments afterwards as before. ■
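
As a sanity check on the first inequality of Lemma A3, the following Python sketch evaluates both sides on randomly drawn $(a, b)$ for several values of $p$, using the lemma's convention $\lceil 0 \rceil = 1$. It is a numerical illustration rather than a proof, and the function names are hypothetical. The second inequality is not checked because the constant $C$ is left unspecified.

```python
import math
import random

def ceil_conv(a):
    """Ceiling with the lemma's convention that the ceiling of 0 is 1."""
    return 1 if a == 0 else math.ceil(a)

def lhs(a, b, p):
    """max{|a_+^p - b_+^p|, ||a|^p - |b|^p|}."""
    one_sided = abs(max(a, 0.0) ** p - max(b, 0.0) ** p)
    two_sided = abs(abs(a) ** p - abs(b) ** p)
    return max(one_sided, two_sided)

def rhs(a, b, p):
    """2p|a-b| (sum_k (u!/k!) |a-b|^(u-k) |b|^k)^((p-1)/u), u = ceil(p-1)."""
    u = ceil_conv(p - 1)
    d = abs(a - b)
    s = sum(math.factorial(u) / math.factorial(k) * d ** (u - k) * abs(b) ** k
            for k in range(u + 1))
    return 2.0 * p * d * s ** ((p - 1) / u)

def worst_ratio(p, draws=2000, seed=0):
    """Largest observed lhs/rhs over random draws (pairs with rhs = 0 skipped)."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(draws):
        a, b = rng.uniform(-3, 3), rng.uniform(-3, 3)
        bound = rhs(a, b, p)
        if bound > 0:
            worst = max(worst, lhs(a, b, p) / bound)
    return worst

if __name__ == "__main__":
    for p in (1.0, 1.5, 2.0, 2.5, 3.0):
        print(p, worst_ratio(p))
```

Every reported ratio should be at most one if the bound holds on the sampled points.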



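Define for $j \in \mathcal{J}$,

\[
k_{jn,r}(x) = h^{-d}\mathbf{E}\left[ \left| Y_{ji}K\left( \frac{x - X_i}{h} \right) \right|^r \right], \quad r \ge 1. \tag{6.5}
\]

Lemma A4: Suppose that Assumptions 1(i) and 2 hold and $h \to 0$ as $n \to \infty$. Then for $\varepsilon > 0$ in Assumption 1(i), there exist a positive integer $n_0$ and constants $c_1, c_2 > 0$ such that for all $n \ge n_0$, all $r \in [1, 2p + 2]$, and all $j \in \mathcal{J}$,

\[
0 < c_1 \le \inf_{x \in \mathcal{S}_j^{\varepsilon/2}} \rho_{jn}^2(x) \quad\text{and}\quad \sup_{x \in \mathcal{S}_j^{\varepsilon/2}} k_{jn,r}(x) \le c_2 < \infty.
\]

Proof: Since $h \to 0$ as $n \to \infty$, we apply change of variables to find that from large $n$ on,

\begin{align*}
\inf_{x \in \mathcal{S}_j^{\varepsilon/2}} \rho_{jn}^2(x) &= \inf_{x \in \mathcal{S}_j^{\varepsilon/2}} h^{-d}\mathbf{E}\left[ Y_{ji}^2K^2\left( \frac{x - X_i}{h} \right) \right] \\
&\ge \inf_{x \in \mathcal{S}_j^{\varepsilon}} \mathbf{E}[Y_{ji}^2 | X_i = x]f(x)\int_{[-1/2,1/2]^d} K^2(u)\,du > c_1
\end{align*}

for some $c_1 > 0$ by Assumptions 1(i) and 2. Similarly, from some large $n$ on,

\[
\sup_{x \in \mathcal{S}_j^{\varepsilon/2}} k_{jn,r}(x) \le \sup_{x \in \mathcal{S}_j^{\varepsilon}} \mathbf{E}[|Y_{ji}|^r | X_i = x]f(x)\int |K(u)|^r\,du < \infty,
\]

by Assumptions 1(i) and 2. ■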



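Define for each $j \in \mathcal{J}$,

\[
\hat g_{jN}(x) = \frac{1}{nh^d}\sum_{i=1}^{N} Y_{ji}K\left( \frac{x - X_i}{h} \right),
\]

where $N$ is a Poisson random variable that is common across $j \in \mathcal{J}$, has mean $n$, and is independent of $\{(Y_{ji}, X_i) : j \in \mathcal{J}\}_{i=1}^{\infty}$. Let for each $j \in \mathcal{J}$,

\[
v_{jn}(x) = \hat g_{jn}(x) - \mathbf{E}\hat g_{jn}(x), \quad\text{and}\quad v_{jN}(x) = \hat g_{jN}(x) - \mathbf{E}\hat g_{jn}(x).
\]

We define, for each $j \in \mathcal{J}$,

\[
\xi_{jn}(x) = \frac{\sqrt{nh^d}\,v_{jN}(x)}{\rho_{jn}(x)} \quad\text{and}\quad V_{jn}(x) = \frac{\sum_{i \le N_1} Y_{ji}K((x - X_i)/h) - \mathbf{E}(Y_{ji}K((x - X_i)/h))}{\sqrt{\mathbf{E}[Y_{ji}^2K^2((x - X_i)/h)]}}, \tag{6.6}
\]

where $N_1$ denotes a Poisson random variable with mean 1 that is independent of $\{(Y_{ji}, X_i) : j \in \mathcal{J}\}_{i=1}^{\infty}$. Then, $\mathrm{Var}(V_{jn}(x)) = 1$. Let $V_{jn}^{(i)}(x)$, $i = 1, \ldots, n$, be i.i.d. copies of $V_{jn}(x)$ so that

\[
\xi_{jn}(x) \overset{d}{=} \frac{1}{\sqrt{n}}\sum_{i=1}^{n} V_{jn}^{(i)}(x). \tag{6.7}
\]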
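
The normalization $\mathrm{Var}(V_{jn}(x)) = 1$ follows from the compound Poisson variance formula: a Poisson(1) sum of i.i.d. terms $W_i$ has variance $\mathbf{E}[W_i^2]$. The Python sketch below checks this by simulation. It is illustrative only: $W_i \sim \mathrm{Unif}(0,1)$ is a stand-in for $Y_{ji}K((x - X_i)/h)$, chosen because $\mathbf{E}W_i = 1/2$ and $\mathbf{E}W_i^2 = 1/3$ are known, and the function names are hypothetical.

```python
import math
import random

def poisson1(rng):
    """Poisson(1) draw by Knuth's multiplication method."""
    limit = math.exp(-1.0)
    k = 0
    prod = rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def draw_v(rng):
    """One draw of (sum_{i <= N_1} W_i - E W) / sqrt(E W^2), W ~ Unif(0,1)."""
    total = sum(rng.random() for _ in range(poisson1(rng)))
    return (total - 0.5) / math.sqrt(1.0 / 3.0)

def sample_moments(reps=50000, seed=0):
    """Sample mean and variance of draw_v over reps independent draws."""
    rng = random.Random(seed)
    draws = [draw_v(rng) for _ in range(reps)]
    mean = sum(draws) / reps
    var = sum((v - mean) ** 2 for v in draws) / (reps - 1)
    return mean, var

if __name__ == "__main__":
    print(sample_moments())
```

The sample mean should be close to zero and the sample variance close to one, matching the centering and scaling in (6.6).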

Lemma A5: Suppose that Assumptions 1(i)(iii) and 2 hold and $h \to 0$ as $n \to \infty$ and

\[
\limsup_{n \to \infty} n^{-r/2+1}h^{(1-r/2)d} < C,
\]

for some constant $C > 0$ and for $r \in [2, 2p + 2]$. Then, for $\varepsilon > 0$ in Assumption 1(iii),

\[
\sup_{x \in \mathcal{S}_j^{\varepsilon/2}} \mathbf{E}[|V_{jn}(x)|^r] \le C_1h^{(1-r/2)d} \quad\text{and}\quad \sup_{x \in \mathcal{S}_j^{\varepsilon/2}} \mathbf{E}[|\xi_{jn}(x)|^r] \le C_2,
\]

where $C_1$ and $C_2$ are constants that depend only on $r$.

Proof: For all $x \in \mathcal{S}_j^{\varepsilon/2}$, $\mathbf{E}[V_{jn}^2(x)] = 1$. Recall the definition of $k_{jn,r}(x)$ in (6.5). Then for some $C_0, C_1 > 0$,

\[
\sup_{x \in \mathcal{S}_j^{\varepsilon/2}} \mathbf{E}|V_{jn}(x)|^r \le C_0\sup_{x \in \mathcal{S}_j^{\varepsilon/2}} \frac{h^{d}k_{jn,r}(x)}{(h^{d}\rho_{jn}^2(x))^{r/2}} \le C_1h^{(1-r/2)d}, \tag{6.8}
\]

by Lemma A4, completing the proof of the first statement.

As for the second statement, using (6.7) and applying Rosenthal's inequality (e.g. (2.3) of GMZ), we deduce that for positive constants $C_3$, $C_4$ and $C_5$ that depend only on $r$,

\begin{align*}
\sup_{x \in \mathcal{S}_j^{\varepsilon/2}} \mathbf{E}[|\xi_{jn}(x)|^r] &\le C_3\sup_{x \in \mathcal{S}_j^{\varepsilon/2}} \max\{ (\mathbf{E}V_{jn}^2(x))^{r/2}, n^{-r/2+1}\mathbf{E}|V_{jn}(x)|^r \} \\
&\le C_4\max\{ 1, C_5n^{-r/2+1}h^{(1-r/2)d} \}
\end{align*}

by (6.8). By the condition that $\limsup_{n \to \infty} n^{-r/2+1}h^{(1-r/2)d} < C$, the desired result follows. ■



The following lemma is adapted from Lemma 6.3 of GMZ. The result is obtained by 
combining Lemmas A2-A5. 



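Lemma A6: Suppose that Assumptions 1 and 2 hold and $h \to 0$ and $n^{-1/2}h^{-d} \to 0$ as $n \to \infty$. Then for any Borel set $A \subset \mathbf{R}^d$ and for any $j \in \mathcal{J}$,

\[
\int_A \left\{ n^{p/2}h^{(p-1)d/2}\mathbf{E}\Lambda_p(v_{jN}(x)) - h^{-d/2}\rho_{jn}^p(x)\mathbf{E}\Lambda_p(Z_1) \right\} w_j(x)\,dx \to 0
\]

and

\[
\int_A \left\{ n^{p/2}h^{(p-1)d/2}\mathbf{E}\Lambda_p(v_{jn}(x)) - h^{-d/2}\rho_{jn}^p(x)\mathbf{E}\Lambda_p(Z_1) \right\} w_j(x)\,dx \to 0.
\]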
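Proof: Recall the definition of $\xi_{jn}(x)$ in (6.6) and write

\[
n^{p/2}h^{(p-1)d/2}\mathbf{E}\Lambda_p(v_{jN}(x)) - h^{-d/2}\rho_{jn}^p(x)\mathbf{E}\Lambda_p(Z_1) = h^{-d/2}\rho_{jn}^p(x)\left\{ \mathbf{E}\Lambda_p(\xi_{jn}(x)) - \mathbf{E}\Lambda_p(Z_1) \right\}.
\]

In view of Lemma A4 and Assumption 1(ii), we find that it suffices for the first statement of the lemma to show that

\[
\sup_{x \in \mathcal{S}_j} |\mathbf{E}\Lambda_p(\xi_{jn}(x)) - \mathbf{E}\Lambda_p(Z_1)| = o(h^{d/2}). \tag{6.9}
\]

By Lemma A5, $\sup_{x \in \mathcal{S}_j} \mathbf{E}|V_{jn}(x)|^3 \le Ch^{-d/2}$ for some $C > 0$. Using Lemma A2 and taking $r = \max\{p, 3\}$, $V_{jn}^{(i)}(x) = W_i$, and $\Lambda_p(\cdot) = \varphi(\cdot)$, we deduce that

\begin{align*}
\sup_{x \in \mathcal{S}_j} |\mathbf{E}\Lambda_p(\xi_{jn}(x)) - \mathbf{E}\Lambda_p(Z_1)|
&\le C_1n^{-1/2}\sup_{x \in \mathcal{S}_j} \mathbf{E}|V_{jn}(x)|^3 + C_2n^{-(r-2)/2}\sup_{x \in \mathcal{S}_j} \mathbf{E}|V_{jn}(x)|^r \\
&\quad + C_3\sup_{x \in \mathcal{S}_j} \mathbf{E}\left[ \omega_{\Lambda_p}\left( Z_1; \frac{C}{\sqrt{n}}\mathbf{E}|V_{jn}(x)|^3 \right) \right], \tag{6.10}
\end{align*}

for some constants $C_s > 0$, $s = 1, 2, 3$. The first two terms are $o(h^{d/2})$. As for the last expectation, observe that by Lemma A3,

\[
\mathbf{E}\left[ \omega_{\Lambda_p}\left( Z_1; \frac{C}{\sqrt{n}}\mathbf{E}|V_{jn}(x)|^3 \right) \right] \le C\sum_{k=0}^{\lceil p-1 \rceil} \left( \frac{C}{\sqrt{n}}\mathbf{E}|V_{jn}(x)|^3 \right)^{p - \frac{(p-1)k}{\lceil p-1 \rceil}} \mathbf{E}|Z_1|^{\frac{(p-1)k}{\lceil p-1 \rceil}}.
\]

The last sum is $O(n^{-1/2}h^{-d/2}) = o(h^{d/2})$ uniformly over $x \in \mathcal{S}_j$, completing the proof of (6.9).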

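We consider the second statement. Let $\bar V_{jn}^{(k)}(x)$, $k = 1, \ldots, n$, be i.i.d. copies of

\[
\bar V_{jn}(x) = \frac{Y_{ji}K((x - X_i)/h) - \mathbf{E}(Y_{ji}K((x - X_i)/h))}{\sqrt{\mathbf{E}[Y_{ji}^2K^2((x - X_i)/h)] - (\mathbf{E}[Y_{ji}K((x - X_i)/h)])^2}}
\]

so that $\mathrm{Var}(\bar V_{jn}(x)) = 1$. Observe that for some constants $C_1, C_2 > 0$,

\[
\sup_{x \in \mathcal{S}_j} \mathbf{E}|\bar V_{jn}(x)|^3 \le C_1\sup_{x \in \mathcal{S}_j} \frac{h^{d}k_{jn,3}(x)}{(h^{d}\rho_{jn}^2(x) - h^{2d}b_{jn}^2(x))^{3/2}} \le C_2h^{-d/2}, \tag{6.11}
\]

where $b_{jn}(x) = h^{-d}\mathbf{E}[Y_{ji}K((x - X_i)/h)]$. The last inequality follows by Lemma A4. Define

\[
\bar\xi_{jn}(x) = \frac{\sqrt{nh^d}\,v_{jn}(x)}{\bar\rho_{jn}(x)},
\]

where $\bar\rho_{jn}^2(x) = nh^d\,\mathrm{Var}(v_{jn}(x))$. Then $\bar\xi_{jn}(x) = \frac{1}{\sqrt{n}}\sum_{k=1}^{n} \bar V_{jn}^{(k)}(x)$. Using Lemma A2 and following the arguments in (6.10) analogously, we deduce that

\[
\sup_{x \in \mathcal{S}_j} |\mathbf{E}\Lambda_p(\bar\xi_{jn}(x)) - \mathbf{E}\Lambda_p(Z_1)| = o(h^{d/2}).
\]

This leads us to conclude that

\[
\int_A \left\{ n^{p/2}h^{(p-1)d/2}\mathbf{E}\Lambda_p(v_{jn}(x)) - h^{-d/2}\bar\rho_{jn}^p(x)\mathbf{E}\Lambda_p(Z_1) \right\} w_j(x)\,dx = o(1).
\]

Now, there exists $n_0$ such that for all $n \ge n_0$, $\sup_{x \in \mathcal{S}_j} h^d b_{jn}^2(x) < c_1/2$, where $c_1 > 0$ is the constant in Lemma A4. Observe that for all $n \ge n_0$,

\begin{align*}
\sup_{x \in \mathcal{S}_j} h^{-d/2}|\bar\rho_{jn}^p(x) - \rho_{jn}^p(x)|
&= \sup_{x \in \mathcal{S}_j} h^{-d/2}\left| (\rho_{jn}^2(x) - h^d b_{jn}^2(x))^{p/2} - (\rho_{jn}^2(x))^{p/2} \right| \\
&\le \sup_{x \in \mathcal{S}_j} \frac{p}{2}h^{d/2}b_{jn}^2(x) \cdot \max\left\{ (\rho_{jn}^2(x) + c_1/2)^{p/2-1}, (\rho_{jn}^2(x) - c_1/2)^{p/2-1} \right\}.
\end{align*}

By Lemma A4, the last term is $O(h^{d/2}) = o(1)$. This completes the proof. ■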

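Recall the definition: $\rho_j^2(x) = \mathbf{E}[Y_{ji}^2 | X_i = x]f(x)\int K^2(u)\,du$. Let

\[
\sigma_{jk,n}(A) = n^p h^{(p-1)d}\int_A\int_A \mathrm{Cov}\left( \Lambda_p(v_{jN}(x)), \Lambda_p(v_{kN}(z)) \right) w_j(x)w_k(z)\,dx\,dz, \quad\text{and}
\]

\[
\sigma_{jk}(A) = \int_A q_{jk,p}(x)\rho_j^p(x)\rho_k^p(x)w_j(x)w_k(x)\,dx, \tag{6.12}
\]

where we recall the definition:

\[
q_{jk,p}(x) = \int_{[-1,1]^d} \mathrm{Cov}\left( \Lambda_p\left( \sqrt{1 - t_{jk}^2(x,u)}\,Z_1 + t_{jk}(x,u)Z_2 \right), \Lambda_p(Z_2) \right) du.
\]

Now, let $(Z_{1n}(x), Z_{2n}(z)) \in \mathbf{R}^2$ be a jointly normal centered random vector whose covariance matrix is the same as that of $(\xi_{jn}(x), \xi_{kn}(z))$ for all $x, z \in \mathbf{R}^d$. We define

\[
\tau_{jk,n}(A) = \int_A\int_{[-1,1]^d} g_{jk,n}(x,u)\lambda_{jk,n}(x, x + uh)\,du\,dx,
\]

where

\begin{align*}
\lambda_{jk,n}(x,z) &= \rho_{jn}^p(x)\rho_{kn}^p(z)w_j(x)w_k(z)1_A(x)1_A(z), \quad\text{and} \\
g_{jk,n}(x,u) &= \mathrm{Cov}\left( \Lambda_p(Z_{1n}(x)), \Lambda_p(Z_{2n}(x + uh)) \right).
\end{align*}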

The following result generalizes Lemma 6.5 of GMZ from a univariate $X$ to a multivariate $X$. The truncation arguments in their proof on pages 752 and 753 do not apply in the case of multivariate $X$. The proof of the following lemma employs a different approach for this part.



Lemma A7: Suppose that Assumptions 1 and 2 hold and let $h \to 0$ as $n \to \infty$ satisfying $\limsup_{n \to \infty} n^{-r/2+1}h^{(1-r/2)d} < C$ for any $r \in [2, 2p + 2]$ for some $C > 0$.
(i) Suppose that $A \subset \mathcal{S}_j \cap \mathcal{S}_k$ is any Borel set. Then

\[
\sigma_{jk,n}(A) = \tau_{jk,n}(A) + o(1).
\]

(ii) Suppose further that $A$ has a finite Lebesgue measure, $\rho_j(\cdot)\rho_k(\cdot)$ and $w_j(\cdot)w_k(\cdot)$ are continuous and bounded on $A$, and

\[
\sup_{x \in A} |\rho_{l,n}(x) - \rho_l(x)| \to 0, \text{ as } n \to \infty, \text{ for } l \in \{j, k\}. \tag{6.13}
\]

Then, as $n \to \infty$, $\tau_{jk,n}(A) = \sigma_{jk}(A) + o(1)$, and hence from (i),

\[
\sigma_{jk,n}(A) \to \sigma_{jk}(A).
\]
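Proof: (i) By change of variables, we write $\sigma_{jk,n}(A) = \tilde\tau_{jk,n}(A)$, where

\[
\tilde\tau_{jk,n}(A) = \int_A\int_{[-1,1]^d} \mathrm{Cov}\left( \Lambda_p(\xi_{jn}(x)), \Lambda_p(\xi_{kn}(x + uh)) \right)\lambda_{jk,n}(x, x + uh)\,du\,dx.
\]

Fix $\varepsilon_1 \in (0, 1]$ and let $c(\varepsilon_1) = (1 + \varepsilon_1)^2 - 1$. Let $\eta_1$ and $\eta_2$ be two independent random variables that are independent of $(\{Y_{ji}, X_i : j \in \mathcal{J}\}_{i=1}^{\infty}, N)$, each having a two-point distribution that gives two points, $\{\sqrt{c(\varepsilon_1)}\}$ and $\{-\sqrt{c(\varepsilon_1)}\}$, the equal mass of $1/2$, so that $\mathbf{E}\eta_1 = \mathbf{E}\eta_2 = 0$ and $\mathrm{Var}(\eta_1) = \mathrm{Var}(\eta_2) = c(\varepsilon_1)$. Furthermore, observe that for any $r \ge 1$,

\[
\mathbf{E}|\eta_1|^r = \frac{1}{2}|c(\varepsilon_1)|^{r/2} + \frac{1}{2}|c(\varepsilon_1)|^{r/2} \le C\varepsilon_1^{r/2}, \tag{6.14}
\]

for some constant $C > 0$ that depends only on $r$. Define

\[
\xi_{jn,1}^{\eta}(x) = \frac{\xi_{jn}(x) + \eta_1}{1 + \varepsilon_1} \quad\text{and}\quad \xi_{kn,2}^{\eta}(x + uh) = \frac{\xi_{kn}(x + uh) + \eta_2}{1 + \varepsilon_1}.
\]

Note that $\mathrm{Var}(\xi_{jn,1}^{\eta}(x)) = \mathrm{Var}(\xi_{kn,2}^{\eta}(x + uh)) = 1$. Let $(Z_{1n}^{\eta}(x), Z_{2n}^{\eta}(x + uh))$ be a jointly normal centered random vector whose covariance matrix is the same as that of $(\xi_{jn,1}^{\eta}(x), \xi_{kn,2}^{\eta}(x + uh))$ for all $(x, u) \in \mathbf{R}^d \times [-1, 1]^d$. Define

\begin{align*}
\tilde\tau_{jk,n}^{\eta}(A) &= \int_A\int_{[-1,1]^d} \mathrm{Cov}\left( \Lambda_p(\xi_{jn,1}^{\eta}(x)), \Lambda_p(\xi_{kn,2}^{\eta}(x + uh)) \right)\lambda_{jk,n}(x, x + uh)\,du\,dx, \\
\tau_{jk,n}^{\eta}(A) &= \int_A\int_{[-1,1]^d} \mathrm{Cov}\left( \Lambda_p(Z_{1n}^{\eta}(x)), \Lambda_p(Z_{2n}^{\eta}(x + uh)) \right)\lambda_{jk,n}(x, x + uh)\,du\,dx.
\end{align*}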
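
The arithmetic behind the perturbation by $\eta_1$ can be verified directly. The Python sketch below is a small illustration with hypothetical function names: it computes the moments of the two-point variable by enumeration, checks that rescaling by $1 + \varepsilon_1$ restores unit variance when the unperturbed variable has unit variance and is independent of $\eta_1$, and uses $c(\varepsilon_1) = \varepsilon_1(2 + \varepsilon_1) \le 3\varepsilon_1$ for $\varepsilon_1 \in (0, 1]$ to bound the moments as in (6.14) with $C = 3^{r/2}$.

```python
import math

def c_eps(eps):
    """c(eps) = (1 + eps)^2 - 1."""
    return (1.0 + eps) ** 2 - 1.0

def eta_moment(eps, r):
    """E|eta|^r for eta = +sqrt(c) or -sqrt(c), each with probability 1/2."""
    points = (math.sqrt(c_eps(eps)), -math.sqrt(c_eps(eps)))
    return sum(0.5 * abs(v) ** r for v in points)

def var_perturbed(eps, var_xi=1.0):
    """Var((xi + eta) / (1 + eps)) for independent xi, eta, Var(xi) = var_xi."""
    return (var_xi + c_eps(eps)) / (1.0 + eps) ** 2

if __name__ == "__main__":
    for eps in (0.01, 0.1, 1.0):
        print(eps, var_perturbed(eps), eta_moment(eps, 3), (3 * eps) ** 1.5)
```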

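Then first observe that

\begin{align*}
|\tilde\tau_{jk,n}(A) - \tilde\tau_{jk,n}^{\eta}(A)| &\le \int_A\int_{[-1,1]^d} |\Delta_{jk,n,1}^{\eta}(x,u)|\lambda_{jk,n}(x, x + uh)\,du\,dx \\
&\quad + \int_A\int_{[-1,1]^d} |\Delta_{jk,n,2}^{\eta}(x,u)|\lambda_{jk,n}(x, x + uh)\,du\,dx,
\end{align*}

where

\begin{align*}
\Delta_{jk,n,1}^{\eta}(x,u) &= \mathbf{E}\Lambda_p(\xi_{jn}(x))\mathbf{E}\Lambda_p(\xi_{kn}(x + uh)) - \mathbf{E}\Lambda_p(\xi_{jn,1}^{\eta}(x))\mathbf{E}\Lambda_p(\xi_{kn,2}^{\eta}(x + uh)) \quad\text{and} \\
\Delta_{jk,n,2}^{\eta}(x,u) &= \mathbf{E}\Lambda_p(\xi_{jn}(x))\Lambda_p(\xi_{kn}(x + uh)) - \mathbf{E}\Lambda_p(\xi_{jn,1}^{\eta}(x))\Lambda_p(\xi_{kn,2}^{\eta}(x + uh)).
\end{align*}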
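Since for any $a, b \in \mathbf{R}$, $|a_+^p - b_+^p| \le p|a - b|(|a|^{p-1} + |b|^{p-1})$, we bound $|\Delta_{jk,n,2}^{\eta}(x,u)|$ by

\begin{align*}
&p\mathbf{E}\left[ |\xi_{jn}(x) - \xi_{jn,1}^{\eta}(x)|\left( |\xi_{jn}(x)|^{p-1} + |\xi_{jn,1}^{\eta}(x)|^{p-1} \right)|\xi_{kn}(x + uh)|^p \right] \\
&\quad + p\mathbf{E}\left[ |\xi_{kn}(x + uh) - \xi_{kn,2}^{\eta}(x + uh)|\left( |\xi_{kn}(x + uh)|^{p-1} + |\xi_{kn,2}^{\eta}(x + uh)|^{p-1} \right)|\xi_{jn,1}^{\eta}(x)|^p \right] \\
&= A_{1n}(x,u) + A_{2n}(x,u), \text{ say.}
\end{align*}

As for $A_{1n}(x,u)$,

\[
A_{1n}(x,u) \le p\left( \mathbf{E}\left[ |\xi_{jn}(x) - \xi_{jn,1}^{\eta}(x)|^2\left( |\xi_{jn}(x)|^{p-1} + |\xi_{jn,1}^{\eta}(x)|^{p-1} \right)^2 \right] \right)^{1/2} \times \left( \mathbf{E}\left[ |\xi_{kn}(x + uh)|^{2p} \right] \right)^{1/2}.
\]

Define

\[
s = \begin{cases} (p+1)/(p-1) & \text{if } p > 1 \\ 2 & \text{if } p = 1, \end{cases}
\]

and $q = (1 - 1/s)^{-1}$. Note that by Hölder inequality,

\begin{align*}
&\mathbf{E}\left[ |\xi_{jn}(x) - \xi_{jn,1}^{\eta}(x)|^2\left( |\xi_{jn}(x)|^{p-1} + |\xi_{jn,1}^{\eta}(x)|^{p-1} \right)^2 \right] \\
&\quad\le \left( \mathbf{E}\left[ |\xi_{jn}(x) - \xi_{jn,1}^{\eta}(x)|^{2q} \right] \right)^{1/q}\left( \mathbf{E}\left[ \left( |\xi_{jn}(x)|^{p-1} + |\xi_{jn,1}^{\eta}(x)|^{p-1} \right)^{2s} \right] \right)^{1/s}.
\end{align*}

Now,

\begin{align*}
\mathbf{E}\left[ |\xi_{jn}(x) - \xi_{jn,1}^{\eta}(x)|^{2q} \right] &= (1 + \varepsilon_1)^{-2q}\mathbf{E}\left[ |\varepsilon_1\xi_{jn}(x) - \eta_1|^{2q} \right] \\
&\le 2^{2q-1}(1 + \varepsilon_1)^{-2q}\left\{ \varepsilon_1^{2q}\mathbf{E}\left[ |\xi_{jn}(x)|^{2q} \right] + \mathbf{E}\left[ |\eta_1|^{2q} \right] \right\}.
\end{align*}

Applying Lemma A5 and (6.14) to the last bound, we conclude that

\[
\sup_{x \in \mathcal{S}_j} \mathbf{E}\left[ |\xi_{jn}(x) - \xi_{jn,1}^{\eta}(x)|^{2q} \right] \le \frac{C_1\varepsilon_1^{2q} + C_2\varepsilon_1^{q}}{(1 + \varepsilon_1)^{2q}},
\]

for some constants $C_1, C_2 > 0$. Using Lemma A5, we can also see that for some constants $C_3, C_4 > 0$,

\[
\sup_{x \in \mathcal{S}_j} \mathbf{E}\left[ \left( |\xi_{jn}(x)|^{p-1} + |\xi_{jn,1}^{\eta}(x)|^{p-1} \right)^{2s} \right] \le C_3
\]

and from some large $n$ on,

\[
\sup_{u \in [-1,1]^d}\sup_{x \in \mathcal{S}_k} \mathbf{E}\left[ |\xi_{kn}(x + uh)|^{2p} \right] \le \sup_{x \in \mathcal{S}_k^{\varepsilon/2}} \mathbf{E}\left[ |\xi_{kn}(x)|^{2p} \right] \le C_4,
\]

for $\varepsilon > 0$ in Assumption 1(iii). Therefore, for some constant $C > 0$,

\[
\sup_{u \in [-1,1]^d}\sup_{x \in \mathcal{S}_j \cap \mathcal{S}_k} A_{1n}(x,u) \le C\sqrt{\varepsilon_1}.
\]

Using similar arguments for $A_{2n}(x,u)$, we deduce that for some constant $C > 0$,

\[
\sup_{u \in [-1,1]^d}\sup_{x \in \mathcal{S}_j \cap \mathcal{S}_k} |\Delta_{jk,n,2}^{\eta}(x,u)| \le C\sqrt{\varepsilon_1}. \tag{6.15}
\]

Let us turn to $\Delta_{jk,n,1}^{\eta}(x,u)$. We bound $|\Delta_{jk,n,1}^{\eta}(x,u)|$ by

\begin{align*}
&p\mathbf{E}\left[ |\xi_{jn}(x) - \xi_{jn,1}^{\eta}(x)|\left( |\xi_{jn}(x)|^{p-1} + |\xi_{jn,1}^{\eta}(x)|^{p-1} \right) \right]\mathbf{E}\left[ |\xi_{kn}(x + uh)|^p \right] \\
&\quad + p\mathbf{E}\left[ |\xi_{kn}(x + uh) - \xi_{kn,2}^{\eta}(x + uh)|\left( |\xi_{kn}(x + uh)|^{p-1} + |\xi_{kn,2}^{\eta}(x + uh)|^{p-1} \right) \right]\mathbf{E}\left[ |\xi_{jn,1}^{\eta}(x)|^p \right].
\end{align*}

Using similar arguments as for $\Delta_{jk,n,2}^{\eta}(x,u)$, we find that for some constant $C > 0$,

\[
\sup_{u \in [-1,1]^d}\sup_{x \in \mathcal{S}_j \cap \mathcal{S}_k} |\Delta_{jk,n,1}^{\eta}(x,u)| \le C\sqrt{\varepsilon_1}. \tag{6.16}
\]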

By Lemma A4 and Assumption 1(ii), there exist $n_0 > 0$ and $C_1, C_2 > 0$ such that for all $n > n_0$,

$$\int_A\int_{[-1,1]^d}\lambda_{jk,n}(x,x+uh)\,du\,dx \le C_1\int_A\int_{[-1,1]^d} w_j(x)w_k(x+uh)\,du\,dx$$
$$\le C_2\sqrt{\int_A\int_{[-1,1]^d} w_j^2(x)\,du\,dx}\,\sqrt{\int_A\int_{[-1,1]^d} w_k^2(x+uh)\,du\,dx} < \infty. \tag{6.17}$$

Hence

$$|\tau^{\eta}_{jk,n}(A)-\tau_{jk,n}(A)| \le C_5\sqrt{\varepsilon_1}\int_A\int_{[-1,1]^d}\lambda_{jk,n}(x,x+uh)\,du\,dx \le C_6\sqrt{\varepsilon_1},$$

for some constants $C_5 > 0$ and $C_6 > 0$.

Since the choice of $\varepsilon_1 > 0$ was arbitrary, it remains for the proof of Lemma A7(i) to prove that

$$|\tau^{\eta}_{jk,n}(A)-\tilde{\tau}_{jk,n}(A)| = o(1), \tag{6.18}$$

as $n \to \infty$ and then $\varepsilon_1 \to 0$. For any $x \in \mathcal{S}_j \cap \mathcal{S}_k$,



1 n 

* 1=1 



[x,u), 



where Wn (x, u)'s are i.i.d. copies of W n (x, u) = (g jn (x), gfc„(x + uh))' with 



£ i<JVl YjJC ((x - Xi)/h) - E [YjiK ((x - Xi)/h)] 



+ 77! W(l+£i). 



h*/* Pjn (x) 

Using the same arguments as in the proof of Lemma A5, we find that for $j \in \mathcal{J}$,

$$\sup_{x\in\mathcal{S}_j}\mathbf{E}\big[|q^{\eta}_{jn}(x)|^{3}\big] \le Ch^{-d/2}, \text{ for some } C > 0. \tag{6.19}$$

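Let $\Sigma_{1n}$ be the $2\times 2$ covariance matrix of $(\xi^{\eta}_{jn,1}(x), \xi^{\eta}_{kn,2}(x+uh))'$. Define

$$\Lambda_{n,p}(v) = \Lambda_p([\Sigma_{1n}^{1/2}v]_1)\,\Lambda_p([\Sigma_{1n}^{1/2}v]_2), \quad v \in \mathbb{R}^2,$$

where $[a]_j$ of a vector $a \in \mathbb{R}^2$ indicates its $j$-th entry. There exists some $C > 0$ such that for all $n$,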



(6.20) 



sup 



sup 

ueR 2 :||2-u||<<5 





A n:P (v) - A„ iP (0) 




1 + 1 


\v\ 


2p+2 m in{ t> 


,1} 



< C and 



A„ )P (^) - ~K, P {u) d${z) < CS for all 5 £ (0, 1]. 



The correlation between Cl n i( x ) anc ^ £fcn,2( x + * s ec i ua l to 

E + «*)] = £ [-(1 + eO" 2 , (1 + 



Hence, as for W^fou) = E^W^x, w), by fl6TT9"j) . 
(6.21) sup E||H/»(x,n)|| 3 



< d(l - (E[e^ 1 (x)e fe \ 2 (^ + «^)]) 2 )" 3/2 {sup^E[|g jn (z)| 3 ] + sup xe5| E[|g fcn (x)| 3 ]} 

< d(l - (1 +£i)- 4 r 3/2 {su PxeiS| E[|g jn (x)| 3 ] +sup xeiS ,E[|g fcn (x)|' 

< C 2 (l - (1 + e 1 )- 4 )~ 3/2 h- d/2 , for some Ci, C 2 > 0, 
so that $n^{-1/2}\sup_{x\in\mathcal{S}_j\cap\mathcal{S}_k}\mathbf{E}\|W^{\eta}_n(x,u)\|^{3} = O(n^{-1/2}h^{-d/2})$. By Lemma A2 and following the arguments in (6.10) analogously,



O {n- l / 2 h- d ' 2 ) = o(l), 



sup 



EA 



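where $Z^{\eta}_n(x,u) = \Sigma_{1n}^{-1/2}(Z^{\eta}_{1n}(x), Z^{\eta}_{2n}(x+uh))'$. Certainly by (6.14) and Lemma A5,

$$\mathrm{Cov}\big(\Lambda_p(Z^{\eta}_{1n}(x)),\Lambda_p(Z^{\eta}_{2n}(x+uh))\big) \le \sqrt{\mathbf{E}|Z^{\eta}_{1n}(x)|^{2p}}\sqrt{\mathbf{E}|Z^{\eta}_{2n}(x+uh)|^{2p}} \le C,$$

for some $C > 0$ that does not depend on $\varepsilon_1$. Using (6.17), we apply the dominated convergence theorem to obtain that

$$|\tau^{\eta}_{jk,n}(A)-\tilde{\tau}^{\eta}_{jk,n}(A)| = o(1) \tag{6.22}$$

as $n \to \infty$ for each $\varepsilon_1 > 0$.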

Finally, note from (6.15) and (6.16) that, for all $x \in A$ and all $u \in [-1,1]^d$,

$$\mathrm{Cov}\big(\Lambda_p(Z^{\eta}_{1n}(x)),\Lambda_p(Z^{\eta}_{2n}(x+uh))\big) = \mathrm{Cov}\big(\Lambda_p(Z_{1n}(x)),\Lambda_p(Z_{2n}(x+uh))\big) + o(1),$$

where the $o(1)$ term is one that converges to zero as $n \to \infty$ and then $\varepsilon_1 \to 0$. Therefore, by the dominated convergence theorem,

$$|\tilde{\tau}^{\eta}_{jk,n}(A)-\tilde{\tau}_{jk,n}(A)| = o(1),$$

as $n \to \infty$ and then $\varepsilon_1 \to 0$. In view of (6.22), this completes the proof of (6.18) and, as a consequence, that of (i).



(ii) Define tj kjTl (x,u) = 
e jk (x,u) 



E(t,jn{x)t,kn{x + Uh)) 
1 



h d 



E 



x - XA fx - Xi 



u 



and 



/ K(z)K(z + u)d* 



jK 2 (u)du 

By Assumption l(i), and Lemma A4, for almost every x G A and for each u G [—1,1]' 

1 1 



tjk,n(,%i w) 



E 



(6.23) 



Pjn{x)pkn{x + ^) ^ 
Cjk,n(%] U,) 

Pjn(x)p kn (x + uh) pj(x)p k (x + uh) 



YjiY ki K ( ) If ( - 



h 



h 



e jk (x,u) 



+ o(l) = t jk (x,u) + o(l), 
where we recall that $t_{jk}(x,u) = e_{jk}(x,u)/(\rho_j(x)\rho_k(x))$ by the definition of $t_{jk}$.



By (EI3D, 



$$\tau_{jk,n}(A) = \int_A\int_{[-1,1]^d} g_{jk,n}(x,u)\,\lambda_{jk}(x,x+uh)\,du\,dx + o(1),$$

where $\lambda_{jk}(x,z) = \rho_j^{p}(x)\rho_k^{p}(z)w_j(x)w_k(z)1_A(x)1_A(z)$. By (6.23), for almost every $x \in A$ and for each $u \in [-1,1]^d$,

$$g_{jk,n}(x,u) \to g_{jk}(x,u), \text{ as } n \to \infty,$$

where $g_{jk}(x,u) = \mathrm{Cov}\big(\Lambda_p(\sqrt{1-t_{jk}^2(x,u)}\,Z_1 + t_{jk}(x,u)Z_2), \Lambda_p(Z_2)\big)$. Furthermore, since $\rho_j(\cdot)\rho_k(\cdot)$ and $w_j(\cdot)w_k(\cdot)$ are continuous on $A$ and $A$ has a finite Lebesgue measure, we follow the proof of Lemma 6.4 of GMZ to find that $g_{jk,n}(x,u)\lambda_{jk}(x,x+uh)$ converges in measure to $g_{jk}(x,u)\lambda_{jk}(x,x)$ on $A \times [-1,1]^d$, as $n \to \infty$. Using the bounded convergence theorem, we deduce the desired result. ■

The following lemma is a generalization of Lemma 6.2 of GMZ from p = 1 to p > 1. 
The proof of GMZ does not carry over to this general case because the majorization 
inequality of Pinelis (1994) used in GMZ does not apply here. (Note that (4) in Pinelis 
(1994) does not apply when p > 1.) 

Lemma A8: Suppose that Assumptions 1 and 2 hold. Furthermore, assume that as $n \to \infty$, $h \to 0$, $n^{-1/2}h^{-d} \to 0$. Then there exists a constant $C > 0$ such that for any Borel set $A \subset \mathbb{R}^d$ and for all $j \in \mathcal{J}$,

$$\limsup_{n\to\infty}\mathbf{E}\left|n^{p/2}h^{(p-1)d/2}\int_A\{\Lambda_p(v_{jn}(x))-\mathbf{E}[\Lambda_p(v_{jn}(x))]\}w_j(x)\,dx\right| \le C\int_A w_j(x)\,dx + C\sqrt{\int_A w_j^2(x)\,dx}.$$



Proof: It suffices to show that there exists $C > 0$ such that for any Borel set $A \subset \mathbb{R}^d$,

Step 1: $\mathbf{E}\left[\left|n^{p/2}h^{(p-1)d/2}\int_A(\Lambda_p(v_{jn}(x))-\Lambda_p(v_{jN}(x)))w_j(x)\,dx\right|\right] \le C\int_A w_j(x)\,dx$,

Step 2: $\mathbf{E}\left[\left|n^{p/2}h^{(p-1)d/2}\int_A(\Lambda_p(v_{jN}(x))-\mathbf{E}[\Lambda_p(v_{jN}(x))])w_j(x)\,dx\right|\right] \le C\sqrt{\int_A w_j^2(x)\,dx}$, and

Step 3: $n^{p/2}h^{(p-1)d/2}\left|\int_A(\mathbf{E}\Lambda_p(v_{jN}(x))-\mathbf{E}[\Lambda_p(v_{jn}(x))])w_j(x)\,dx\right| \to 0$ as $n \to \infty$.

Indeed, by chaining Steps 1, 2 and 3, we obtain the desired result.
Proof of Step 1: For simplicity, let
uj n (x) = E



Vn,ji(x) — 



2 2 / X Xj 
31 \ h 



E 



^jnix) 



Y 3l K 



x- Xj 
h 



E 



x - Xj 
h 

x — X, 
~h~ 



and 



We write, if $N = n$, $\sum_{i=N+1}^{n} = 0$; and if $N > n$, $\sum_{i=N+1}^{n} = -\sum_{i=n+1}^{N}$. Using this notation, write



JV 



V jn{ X ) n j 1 d' S ^2i^ n ^ 

i=l 



1 

i=N+l 



Now, observe that 

JV 



1 V— 

=^yiK,ii(a;Kn(a;) 



/v 



i=i 



jY:■ ^J fY 



i=l 



x - Xi 



h 



Y 3l K 



9jn(x) - -^E 



Vnh d 

+Vnh d (- — — 
\ n 



E 

x — X, 



x - Xj 
h 



h 



h d 



E 



YjiK 



x - Xi 



V nh d VjN(x) + V nh d 



n-N 



n 



h d 



E 



YjiK 



x - X, 



Letting 



Vjn[X) 



Sj n {X) 



= \/n 



n-N 
n 

n 



h d 



E 



YjiK 



x - Xi 



and 



r nh d ^ 

nn i=N+l 



ji \XjUj n \yXj , 



we can write

$$\sqrt{nh^d}\,v_{jn}(x) = \sqrt{nh^d}\,v_{jN}(x) + \big(\sqrt{h^d}\,\eta_{jn}(x) + s_{jn}(x)\big). \tag{6.24}$$

First, note that for some constant $C > 0$,

$$\sup_{x\in\mathcal{S}_j} u_{jn}^2(x) \le Ch^d, \tag{6.25}$$



from some large n on, by Lemma A4. Recall the definition of Pj n {x) '■ Pj n ( x ) 
\J nh d Var(vj n (x)) and note that 



P?„(x) = pUx) - h d b 2 jn (x) = h d uUx). 
As in the proof of Lemma A5, there exist $n_0$, $C_1 > 0$ and $C_2 > 0$ such that for all $n > n_0$,
(6.26) 



Ci > sup Jp 2 jn (x) - h d b 2 n {x) = sup p jn 



x) 



> inf p jn (x) > inf JpUx) - h d bUx) > C 2 . 



V nh d 



v jN {x) 



Using (lfT25l) . (KM . and (lfT26l) . we deduce that for some C u C 2 , C 3 , and 4 > 0, 

n P/2 h (p-i)d/2 f ( Ap ( Vjn ( x ))-A v (v jN (x)))w j (x)dx 

J A 

JA\ V Pin(^)/ V Pjn{X 

Wj(x)dx 



Pjn(X) 



p-1 



Pin (a;) 



Wj(x)dx 



+C,h~ d ' 2 



< C 4 \Vjn(x) 
J A 



Sjn \X) 



Pjn[X) 



V^ d ^4 



Pjn\X) 



+c 3 



Sjn \X ) 



U>jn\X) 



Pjn \X 
p-1 

+ 

p-1 



p-1 



Vnh d 



VjN{X) 



Vnh d ^ 



x 



Pjn[X) 
p-l N 



Wj(x)dx 



Pjn{X) 



Vnh d 



Pjn\X) 

v jN (x) f 



Wj(x)dx 



Pjn{X) 



Wj(x)dx 



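$= A_{1n} + A_{2n}$, say.
To deal with $A_{1n}$ and $A_{2n}$, we first show the following:

Claim 1: $\sup_{x\in\mathcal{S}_j}\mathbf{E}[\eta_{jn}^2(x)] = O(1)$.

Claim 2: $\sup_{x\in\mathcal{S}_j}\mathbf{E}[|s_{jn}(x)/u_{jn}(x)|^{2}] = o(1)$.

Claim 3: $\sup_{x\in\mathcal{S}_j}\mathbf{E}[|\sqrt{nh^d}\,v_{jN}(x)/\rho_{jn}(x)|^{2p-2}] = O(1)$.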



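Proof of Claim 1: By Lemma A4 and the fact that $\mathbf{E}|n^{-1/2}(n-N)|^{2} = O(1)$,

$$\sup_{x\in\mathcal{S}_j}\mathbf{E}[\eta_{jn}^2(x)] \le \mathbf{E}\left|\frac{n-N}{\sqrt{n}}\right|^{2}\sup_{x\in\mathcal{S}_j}\left|\frac{1}{h^d}\mathbf{E}\left[Y_{ji}K\left(\frac{x-X_i}{h}\right)\right]\right|^{2} = O(1).$$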



Proof of Claim 2: Note that 



(6.27) 



Vnh d 



Sjn (^0 
^jni^x) 



N 



x) 



i=n+l 
Certainly $\mathrm{Var}(V_{n,ji}(x)) = 1$. As seen in (6.11), $\sup_{x\in\mathcal{S}_j}\mathbf{E}|V_{n,ji}(x)|^{3} \le Ch^{-d/2}$ for some $C > 0$. Similarly,

sup E iK^x)! 4 < } fhlhM _ < ch~ d , 



h 2d (pjjx) - h d b 2 n (x)Y 



for some C > 0. Hence by Lemma l(i) of Horvath (1991), for some C > 0, 



E ( ] 



2 



-C\E\N-n\ 1/2 E\V nJl (x) 



\ 3 + E\V n)ji (x) 



Note that $\mathbf{E}|N-n| = O(n^{1/2})$ and $\mathbf{E}|N-n|^{1/2} = O(n^{1/4})$ (e.g. (2.21) and (2.22) of Horvath (1991)). Therefore, there exists $C > 0$ such that



f^hd^y] <^ r n l/2 + n l/4 /r< i/2 + /r <n 
\ u jn {x) J 

Since $n^{-1/2}h^{-d} \to 0$, $\sup_{x\in\mathcal{S}_j}\mathbf{E}[(s_{jn}(x)/u_{jn}(x))^{2}] = o(1)$.
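The Poisson moment rates used in the proof of Claim 2, namely $\mathbf{E}|N-n| = O(n^{1/2})$ and $\mathbf{E}|N-n|^{1/2} = O(n^{1/4})$ for $N \sim \text{Poisson}(n)$, can be illustrated by summing against the Poisson probability mass function. This is a minimal sketch, not the argument of Horvath (1991); the function name `poisson_abs_moment` and the truncation point of the sum are assumptions made for the illustration.

```python
import math


def poisson_abs_moment(n: int, power: float) -> float:
    """Approximate E|N - n|^power for N ~ Poisson(n) by a truncated sum."""
    upper = n + int(20 * math.sqrt(n)) + 50   # about 20 standard deviations past the mean
    total = 0.0
    for k in range(upper + 1):
        log_pmf = -n + k * math.log(n) - math.lgamma(k + 1)
        total += abs(k - n) ** power * math.exp(log_pmf)
    return total


for n in (10, 100, 1000, 10000):
    r1 = poisson_abs_moment(n, 1.0) / n ** 0.5       # stays bounded
    r_half = poisson_abs_moment(n, 0.5) / n ** 0.25  # stays bounded
    r2 = poisson_abs_moment(n, 2.0) / n              # Var(N)/n, close to 1
    print(n, round(r1, 4), round(r_half, 4), round(r2, 4))
```

Both normalized moments are bounded by 1, by Jensen's inequality and $\mathrm{Var}(N) = n$; the last column reflects the fact $\mathbf{E}|n^{-1/2}(n-N)|^2 = 1$ used in Claim 1.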



sup E 



Proof of Claim 3: By (IBTSj) . Lemmas A3-A4, and ( I6.26p . we have 



sup E 



V nh d 



v jN (x) 



Pjn[X) 



2p-2 




Pjn{x) 


2p-2 


< 


= sup 










Pjn{x) 







V nh d v 



2p-2 N 



•jn{x) 



Pjn(x) 



< C, 



for some C > 0. This completes the proof of Claim 3. 



Now, using Claims 1-3, we prove Step 1. Let $\mu_j(A) = \int_A w_j(x)\,dx$. Since $h^{(p-1)d/2} = O(1)$ when $p \ge 1$, and $\sqrt{a+b} \le \sqrt{a} + \sqrt{b}$ for any $a \ge 0$ and $b \ge 0$,



E [A ln ] < C / E 



1^0)1 



p-1 



v jN {x) 



Pjn{X) 



P-1 



< C / u i (A)supE 



l%n(^)| 



p-1 



Pjn{x) 



Wj(x)dx 
p-i n 



< 



2C^(A)x su P E[^ n (x)] 



x sup E 



xG>S, 



1/2 



2p-2 



1/2 



sup E 



2p-2 



l/2> 
Certainly, as in the proof of Lemma A5,

$$\sup_{x\in\mathcal{S}_j}\mathbf{E}\left|\frac{\sqrt{nh^d}\,v_{jn}(x)}{\rho_{jn}(x)}\right|^{2p-2} \le C, \tag{6.28}$$

for some constant $C > 0$. Hence using Claims 1 and 3, we conclude that $\mathbf{E}[A_{1n}] \le C\mu_j(A)$ for some $C > 0$. As for $A_{2n}$, similarly, we obtain that for some $C > 0$,



E [A 2n ] < C / E 



Sj n {X) 



Pjn{X) 



p-l 




p-l\ " 


+ 






Pjn{x) 





Wj(x)dx 











2~ 


x 


sup E 




Sj n {X) 






^jnixj 





sup E 



2p-2 



1/2 



1/2 



sup E 



2p-2 



l/2> 



By Claims 2 and 3 and (6.28), $\mathbf{E}[A_{2n}] = o(1)$. Hence the proof of Step 1 is completed.



Proof of Step 2: We can follow the proof of Lemma A7(i) to show that 



E 



K jn {A) + o(l), 



n p/2 h (p-i)d/2 / (| Ui7V ( x )|f_E[|^(x)n)^(a;)(ix 

where Kj„,(A) = J A J, j ^ r jn (x, -u)A jn (x, x + uh)dudx, 

X jn (x,z) = ^W^W^Ww^jU^l^l^^jaJid 
r jn (x,u) = Cov (\Z jnA (x)\ p , \Z jnjB (x + uh)\ p ) , 

with $(Z_{jn,A}(x), Z_{jn,B}(x+uh))' \in \mathbb{R}^2$ denoting a centered normal random vector whose covariance matrix is equal to that of $(\xi_{jn}(x), \xi_{jn}(x+uh))'$. By the Cauchy-Schwarz inequality and Lemma A5,

$$\sup r_{jn}(x,u) \le \sup\sqrt{\mathbf{E}|Z_{jn,A}(x)|^{2p}\,\mathbf{E}|Z_{jn,B}(x+uh)|^{2p}} < \infty.$$



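Furthermore, for each $u \in [-1,1]^d$,

$$\int_A \lambda_{jn}(x,x+uh)\,dx \le C\sqrt{\int_A w_j^2(x)\,dx}\,\sqrt{\int_A w_j^2(x+uh)\,dx}.$$

Since $\int_{\mathcal{S}_j^{\varepsilon}} w_j^2(x)\,dx < \infty$ for some $\varepsilon > 0$ (Assumption 1(ii)), we find that as $h \to 0$, the last term converges to $\int_A w_j^2(x)\,dx$. We obtain the desired result of Step 2.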



Proof of Step 3: The convergence above follows from the proof of Lemma A6. ■ 
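Let $C \subset \mathbb{R}^d$ be a bounded Borel set such that

$$\alpha \equiv P\{X \in \mathbb{R}^d\setminus C\} > 0.$$

For any Borel set $A \subset C$, let

$$\zeta_n(A) = \sum_{j=1}^{J}\int_A\Lambda_p(v_{jn}(x))w_j(x)\,dx \quad\text{and}\quad \zeta_N(A) = \sum_{j=1}^{J}\int_A\Lambda_p(v_{jN}(x))w_j(x)\,dx.$$

We also let $\sigma_n^2(A) = \sum_{j=1}^{J}\sum_{k=1}^{J}\sigma_{jk,n}(A)$ and $\sigma^2(A) = \sum_{j=1}^{J}\sum_{k=1}^{J}\sigma_{jk}(A)$. We define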
where 



$$U_n = \frac{1}{\sqrt{n}}\left\{\sum_{i=1}^{N}1\{X_i \in C\} - nP\{X \in C\}\right\}, \text{ and}$$



LEMMA A9: Suppose that Assumptions 1 and 2 hold. Furthermore, assume that as $n \to \infty$, $h \to 0$, and $n^{-1/2}h^{-d} \to 0$. Let $A \subset C$ be such that $\sigma^2(A) > 0$, $\alpha = P\{X \in \mathbb{R}^d\setminus C\} > 0$, $\rho_j(\cdot)$'s and $w_j(\cdot)$'s are continuous and bounded on $A$, and the condition in (6.13) is satisfied for all $l = 1, \ldots, J$. Then,

$$(S_n(A), U_n) \overset{d}{\to} (Z_1, \sqrt{1-\alpha}\,Z_2).$$

Proof: First, we show that

$$\mathrm{Cov}(S_n(A), U_n) \to 0. \tag{6.29}$$

Write

$$\mathrm{Cov}(S_n(A), U_n) = \frac{n^{p/2}h^{(p-1)d/2}}{\sigma_n(A)}\sum_{j=1}^{J}\int_A\mathrm{Cov}(\Lambda_p(v_{jN}(x)), U_n)\,w_j(x)\,dx.$$

It suffices for (6.29) to show that

$$\mathrm{Cov}\big(n^{p/2}h^{pd/2}\{\zeta_N(A)-\mathbf{E}\zeta_N(A)\}, U_n\big) = o(h^{d/2}), \tag{6.30}$$

since $\sigma_n^2(A) \to \sigma^2(A) = \sum_{j=1}^{J}\sum_{k=1}^{J}\sigma_{jk}(A) > 0$ by Lemma A7. For any $x \in \mathcal{S}_j$,



I \J nh d v jN 



x 



\ pUx) ' y/p{XeC} 



where (<# (x) , U W ) 's are i.i.d. copies of (Q n (x), U) with 



Qn{x) 



h d ' 2 p jn {x) 



and 



U = 



J2 i<N i{ Xt eC}-P{XeC} 



VP{XeC} 

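Uniformly over $x \in \mathcal{S}_j$,

$$r_n(x) \equiv \mathbf{E}[Q_n(x)U] = O(h^{d/2}) = o(1), \tag{6.31}$$

by Lemma A4. Let $(Z_{1n}, Z_{2n})'$ be a centered normal random vector with the same covariance matrix as that of $(Q_n(x), U)'$. Let the 2 by 2 covariance matrix be $\Sigma_{n,2}$. Since $n^{-1/2}\sum_{k=1}^{n}U^{(k)}$ and $Z_{2n}$ have mean zero, we write

$$\mathrm{Cov}\Big(\Lambda_p\Big(\frac{1}{\sqrt{n}}\sum_{k=1}^{n}Q_n^{(k)}(x)\Big), \frac{1}{\sqrt{n}}\sum_{k=1}^{n}U^{(k)}\Big) - \mathrm{Cov}(\Lambda_p(Z_{1n}), Z_{2n})$$
$$= \mathbf{E}\Big[\Lambda_p\Big(\frac{1}{\sqrt{n}}\sum_{k=1}^{n}Q_n^{(k)}(x)\Big)\Big(\frac{1}{\sqrt{n}}\sum_{k=1}^{n}U^{(k)}\Big)\Big] - \mathbf{E}[\Lambda_p(Z_{1n})Z_{2n}] \equiv A_n(x), \text{ say.}$$

Define $\Lambda_{n,p}(v) \equiv \Lambda_p([\Sigma_{n,2}^{1/2}v]_1)[\Sigma_{n,2}^{1/2}v]_2$, $v \in \mathbb{R}^2$. There exists some $C > 0$ such that for all $n \ge 1$,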



sup 







(v) - A n , p (0) 




1 + 1 


\v\ 


\ p+1 min{||u| 


,1} 



< C and 



sup \^n, P i z ) ~ ^n, P ( u )\ d&(z) < CS, for all 5 e (0, 1]. 

ueR 2 :||,z-«||<5 

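Letting $W_n^{(k)}(x) \equiv \Sigma_{n,2}^{-1/2}(Q_n^{(k)}(x), U^{(k)})'$, observe that using (6.31) and following the arguments in (6.21), from some large $n$ on, for some $C > 0$,

$$\mathbf{E}\|W_n^{(k)}(x)\|^{3} = \mathbf{E}\|\Sigma_{n,2}^{-1/2}(Q_n^{(k)}(x), U^{(k)})'\|^{3}$$
$$= \mathbf{E}\big[\{\mathrm{tr}(\Sigma_{n,2}^{-1/2}(Q_n^{(k)}(x), U^{(k)})'(Q_n^{(k)}(x), U^{(k)})\Sigma_{n,2}^{-1/2})\}^{3/2}\big]$$
$$\le C(1-r_n^2(x))^{-3/2}\,\mathbf{E}\big[|Q_n(x)|^{3} + |U|^{3}\big] \le Ch^{-d/2}.$$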
Hence, by Lemma A2,

$$\sup_{x\in\mathcal{S}_j}|A_n(x)| = \sup_{x\in\mathcal{S}_j}\Big|\mathbf{E}\Lambda_{n,p}\Big(\frac{1}{\sqrt{n}}\sum_{k=1}^{n}W_n^{(k)}(x)\Big) - \mathbf{E}\Lambda_{n,p}(\tilde{Z}_n)\Big| = O(n^{-1/2}h^{-d/2}) = o(h^{d/2}),$$

where $\tilde{Z}_n = \Sigma_{n,2}^{-1/2}(Z_{1n}, Z_{2n})'$. This completes the proof of (6.30) and hence that of (6.29).

Now, define

$$\Delta_n(x) = n^{p/2}h^{(p-1)d/2}\sum_{j=1}^{J}\{\Lambda_p(v_{jN}(x)) - \mathbf{E}[\Lambda_p(v_{jN}(x))]\}w_j(x).$$

Following Mason and Polonik (2009), we slice the integral $\int_A\Delta_n(x)\,dx$ into a sum of a 1-dependent random field. Let $C$ be as given in the lemma. Let $\mathbb{Z}^d$ be the set of $d$-tuples of integers, and let $\{R_{n,\mathbf{i}} : \mathbf{i} \in \mathbb{Z}^d\}$ be the collection of rectangles in $\mathbb{R}^d$ such that $R_{n,\mathbf{i}} = [a_{n,i_1}, b_{n,i_1}] \times \cdots \times [a_{n,i_d}, b_{n,i_d}]$, where $i_j$ is the $j$-th entry of $\mathbf{i}$, and $h \le b_{n,i_j} - a_{n,i_j} \le 2h$ for all $j = 1, \ldots, d$; two different rectangles $R_{n,\mathbf{i}}$ and $R_{n,\mathbf{j}}$ do not have an intersection with nonempty interior; and the union of the rectangles $R_{n,\mathbf{i}}$, $\mathbf{i} \in \mathbb{Z}_n^d$, covers $C$ from some sufficiently large $n$ on, where $\mathbb{Z}_n^d$ is the set of $d$-tuples of integers whose absolute values are less than or equal to $n$.

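We let $B_{n,\mathbf{i}} = R_{n,\mathbf{i}} \cap C$ and $\mathcal{I}_n = \{\mathbf{i} \in \mathbb{Z}_n^d : B_{n,\mathbf{i}} \ne \emptyset\}$. Then $B_{n,\mathbf{i}}$ has Lebesgue measure $m(B_{n,\mathbf{i}})$ bounded by $C_1h^d$ and the cardinality of the set $\mathcal{I}_n$ is bounded by $C_2h^{-d}$ for some positive constants $C_1$ and $C_2$. Define

$$\alpha_{\mathbf{i},n} = \frac{1}{\sigma_n(A)}\int_{B_{n,\mathbf{i}}\cap A}\Delta_n(x)\,dx \quad\text{and}\quad u_{\mathbf{i},n} = \frac{1}{\sqrt{n}}\left\{\sum_{j=1}^{N}1\{X_j \in B_{n,\mathbf{i}}\} - nP\{X_j \in B_{n,\mathbf{i}}\}\right\}.$$

Then, we can write

$$S_n(A) = \sum_{\mathbf{i}\in\mathcal{I}_n}\alpha_{\mathbf{i},n} \quad\text{and}\quad U_n = \sum_{\mathbf{i}\in\mathcal{I}_n}u_{\mathbf{i},n}.$$

Certainly $\mathrm{Var}(S_n(A)) = 1$ and it is easy to check that $\mathrm{Var}(U_n) = 1 - \alpha$. Take $\mu_1, \mu_2 \in \mathbb{R}$ and let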
and let 

From (I6^9|) . 

\ieZ n / 
Since $\sigma_n^{r}(A) = \sigma^{r}(A) + o(1)$, $r > 0$, by Lemma A7 and $m(B_{n,\mathbf{i}}) \le Ch^d$ for a constant $C > 0$, we take $r \in (2, (2p+2)/p]$ and bound

<(A)^E| ai ,„r 

< CsupE|A n (x)| r V ( / / I l Bni (u,v,s)dudvds] 

x&A ~ \Ja J A J A J 



r/3 



where $1_B(u,v,s) = 1\{u \in B\}1\{v \in B\}1\{s \in B\}$. Using Jensen's inequality, we have



sup E \A n {x) | r < dn^h^- 1 ^ 2 sup E 



»r p ^j(^) 



< C 2 n rp/2 /i r(p - 1)d/2 max sup E |u iiv (x)| rp 
i<j'<JxeAnSj 

for some $C_1, C_2 > 0$. As for the last term, we apply Rosenthal's inequality (see, e.g., Lemma 2.3 of GMZ): for some constant $C > 0$,



n rp/2 h r(p-l)d/2 ^ g \VjN(x) 



I 



xeAnSi 



< Ch r ^ d ' 2 sup f^E 



sup 

xeAnSi 



1 



-Ch r( - P ~ 1)d/2 



sup 



:E 



x — Xj 



rp/2 



By Lemma A4, the first term is 0(h~ rd / 2 ) and the last term is 0(^1-^/2^-^/2-^/2+^ _ 
Hence we find that 



iex™ 



= Cardinality of X n x O (m(5 n , ji ) r / i - r ' i/2 {l + n 1 -^ /2 /T r * /2+<i }) 

for any $r \in (2, (2p+2)/p]$, because $n^{-1/2}h^{-d} \to 0$. Therefore, as $n \to \infty$,

$$\sum_{\mathbf{i}\in\mathcal{I}_n}\mathbf{E}|\alpha_{\mathbf{i},n}|^{r} \to 0 \text{ for any } r \in (2, (2p+2)/p].$$

Also, arguing similarly as in (6.56) of GMZ, we can show that $\sum_{\mathbf{i}\in\mathcal{I}_n}\mathbf{E}|u_{\mathbf{i},n}|^{r} \to 0$ as $n \to \infty$ for any $r \in (2, (2p+2)/p]$. Since the $X_j$'s are common across different $j$'s, the sequence $\{y_{\mathbf{i},n}\}_{\mathbf{i}\in\mathcal{I}_n}$ is a 1-dependent random field (see Mason and Polonik (2009)). The desired result of Lemma A9 follows by Theorem 1 of Shergin (1993) and the Cramér-Wold device. ■
Lemma A10: Suppose that the conditions of Lemma A9 are satisfied, and let $A \subset \mathbb{R}^d$ be a Borel set in Lemma A9. Then,

$$\frac{n^{p/2}h^{(p-1)d/2}\{\zeta_n(A) - \mathbf{E}\zeta_n(A)\}}{\sigma_n(A)} \overset{d}{\to} N(0,1), \text{ as } n \to \infty.$$

Proof: The conditional distribution of $S_n(A)$ given $N = n$ is equal to that of

$$\frac{n^{p/2}h^{(p-1)d/2}}{\sigma_n(A)}\sum_{j=1}^{J}\int_A\{\Lambda_p(v_{jn}(x)) - \mathbf{E}\Lambda_p(v_{jN}(x))\}w_j(x)\,dx.$$

Using Lemma A9 and the de-Poissonization argument of Beirlant and Mason (1995) (see also Lemma 2.4 of GMZ), this conditional distribution converges to $N(0,1)$. Now by Lemma A6, it follows that

$$n^{p/2}h^{(p-1)d/2}\sum_{j=1}^{J}\int_A\{\mathbf{E}\Lambda_p(v_{jN}(x)) - \mathbf{E}\Lambda_p(v_{jn}(x))\}w_j(x)\,dx \to 0,$$

as $n \to \infty$. This completes the proof. ■



Proof of Theorem 1: Fix $\varepsilon > 0$ as in Assumption 1(iii), and take $n_0 > 0$ such that for all $n > n_0$,

$$\{x - uh : x \in \mathcal{S}_j, u \in [-1/2, 1/2]^d\} \subset \mathcal{S}_j^{\varepsilon} \subset \mathcal{X} \text{ for all } j \in \mathcal{J}.$$

Since we are considering the least favorable case of the null hypothesis,

$$\mathbf{E}[Y_{ji}K((x - X_i)/h)]/h^d = \int_{[-1/2,1/2]^d} m_j(x - uh)K(u)\,du = 0, \text{ for almost all } x \in \mathcal{S}_j,$$

for all $n > n_0$ and for all $j \in \mathcal{J}$. Therefore, $\hat{g}_{jn}(x) = v_{jn}(x)$ for almost all $x \in \mathcal{S}_j$, $j \in \mathcal{J}$, and for all $n > n_0$. From here on, we consider only $n > n_0$.

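We fix $0 < \varepsilon_l \to 0$ as $l \to \infty$ and take a compact set $\mathcal{W}_l \subset \mathcal{S}_j$ such that for each $j \in \mathcal{J}$, $w_j$ is bounded and continuous on $\mathcal{W}_l$ and for $s \in \{1, 2\}$,

$$\int_{\mathcal{X}\setminus\mathcal{W}_l} w_j^{s}(x)\,dx \to 0 \text{ as } l \to \infty. \tag{6.32}$$

We can choose such $\mathcal{W}_l$ following the arguments in the proof of Lemma 6.1 of GMZ because $w_j^{s}$ is integrable by Assumption 1(ii). Take $M_{l,j}, v_{l,j} > 0$, $j = 1, 2, \ldots, J$, such that for $C_{l,j} = [-M_{l,j} + v_{l,j}, M_{l,j} - v_{l,j}]^d$,

$$P\{X \in \mathbb{R}^d\setminus C_{l,j}\} > 0,$$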
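and for some Borel $A_{l,j} \subset C_{l,j} \cap \mathcal{W}_l$, $\rho_j(\cdot)$ is bounded and continuous on $A_{l,j}$,

$$\sup_{x\in A_{l,j}}|\rho_{jn}(x) - \rho_j(x)| \to 0, \text{ as } n \to \infty, \text{ and } \int_{\mathcal{W}_l\setminus A_{l,j}}\rho_j(x)w_j^{s}(x)\,dx \to 0, \text{ as } l \to \infty, \text{ for } s \in \{1, 2\}. \tag{6.33}$$

The existence of $M_{l,j}$, $v_{l,j}$ and $\varepsilon_l$ and the sets $A_{l,j}$ is ensured by Lemma A1. By Assumption 1(i), we find that the second convergence in (6.33) implies that $\int_{\mathcal{W}_l\setminus A_{l,j}} w_j^{s}(x)\,dx \to 0$ as $l \to \infty$, for $s \in \{1, 2\}$. Now, take $A_l = \cap_{j=1}^{J}A_{l,j}$ and $C_l = \cap_{j=1}^{J}C_{l,j}$, and observe that for $s \in \{1, 2\}$,

$$\int_{\mathcal{W}_l\setminus A_l} w_j^{s}(x)\,dx \le \sum_{k=1}^{J}\int_{\mathcal{W}_l\setminus A_{l,k}} w_j^{s}(x)\,dx \to 0, \tag{6.34}$$

as $l \to \infty$ for all $j \in \mathcal{J}$.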
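First, we write $\sigma_n^{-1}\sum_{j=1}^{J}\{n^{p/2}h^{(p-1)d/2}\Gamma_j(\hat{g}_{jn}) - a_{jn}\}$ as

$$\frac{n^{p/2}h^{(p-1)d/2}}{\sigma_n}\{\zeta_n(\mathcal{X}\setminus\mathcal{W}_l) - \mathbf{E}\zeta_n(\mathcal{X}\setminus\mathcal{W}_l)\} + \frac{n^{p/2}h^{(p-1)d/2}}{\sigma_n}\{\zeta_n(\mathcal{W}_l\setminus A_l) - \mathbf{E}\zeta_n(\mathcal{W}_l\setminus A_l)\} + \frac{n^{p/2}h^{(p-1)d/2}}{\sigma_n}\{\zeta_n(A_l) - \mathbf{E}\zeta_n(A_l)\}. \tag{6.35}$$

Since $\mathcal{X}\setminus A_l = (\mathcal{X}\setminus\mathcal{W}_l) \cup (\mathcal{W}_l\setminus A_l)$, by Lemma A8, (6.32), and (6.34),

$$n^{p/2}h^{(p-1)d/2}\{\zeta_n(\mathcal{X}\setminus A_l) - \mathbf{E}\zeta_n(\mathcal{X}\setminus A_l)\} \overset{p}{\to} 0, \text{ as } n \to \infty \text{ and } l \to \infty. \tag{6.36}$$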

Furthermore, we write |cx 2 — cr 2 (Ai)| as 
J J r 

K I / \ / -. /\\T)/\T)/\ / \ / \ l 

X 

j=i fc=i 
J J 

^ /\T)/\T)/\l//-. ^ / \ \ / \ / \ 1 

X 

J 

Wj(x)wk(x)dx. 



j=l fc=l 

J J „ 

< VV sup Ityfc^aO^aOpLfc)! / (l-lA,(a;))wj(a;)wfc(a;)d 

J J r 

= $^$^ sup \qjkA x )f?jnk x )f$J< x )\ I 



3 ' 
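Observe that as $l \to \infty$,

$$\int_{\mathcal{X}\setminus A_l} w_j(x)w_k(x)\,dx \le \sqrt{\int_{\mathcal{X}\setminus A_l} w_j^2(x)\,dx}\,\sqrt{\int_{\mathcal{X}\setminus A_l} w_k^2(x)\,dx} \to 0.$$

From Lemma A4, it follows that

$$\lim_{l\to\infty}\limsup_{n\to\infty}|\sigma_n^2 - \sigma_n^2(A_l)| = 0. \tag{6.37}$$

Furthermore, since $\sigma_n^2(A_l) \to \sigma^2(A_l)$ as $n \to \infty$ for each $l$ by Lemma A7, and $\sigma^2(A_l) \to \sigma^2 > 0$ as $l \to \infty$, by Assumption 1, it follows that for any $\varepsilon_1 \in (0, \sigma^2)$,

$$0 < \sigma^2 - \varepsilon_1 \le \liminf_{n\to\infty}\sigma_n^2 \le \limsup_{n\to\infty}\sigma_n^2 \le \sigma^2 + \varepsilon_1 < \infty. \tag{6.38}$$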

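Combining this with (6.36), we find that as $n \to \infty$ and $l \to \infty$,

$$\frac{n^{p/2}h^{(p-1)d/2}}{\sigma_n}\{\zeta_n(\mathcal{X}\setminus A_l) - \mathbf{E}\zeta_n(\mathcal{X}\setminus A_l)\} = o_P(1).$$

As for the last term in (6.35), by (6.38) and Lemma A10, as $n \to \infty$ and $l \to \infty$,

$$n^{p/2}h^{(p-1)d/2}\{\zeta_n(A_l) - \mathbf{E}\zeta_n(A_l)\} = O_P(1).$$

Therefore, by (6.37),

$$\frac{n^{p/2}h^{(p-1)d/2}}{\sigma_n}\{\zeta_n(A_l) - \mathbf{E}\zeta_n(A_l)\} = \frac{n^{p/2}h^{(p-1)d/2}}{\sigma_n(A_l)}\{\zeta_n(A_l) - \mathbf{E}\zeta_n(A_l)\} + o_P(1),$$

where $o_P(1)$ is a term that vanishes in probability as $n \to \infty$ and $l \to \infty$. For each $l \ge 1$, the last term converges in distribution to $N(0,1)$ by Lemma A10. Since $\sigma_n^2(A_l) \to \sigma^2$ as $n \to \infty$ and $l \to \infty$, we conclude that

$$\sum_{j=1}^{J}\{n^{p/2}h^{(p-1)d/2}\Gamma_j(\hat{g}_{jn}) - a_{jn}\} \overset{d}{\to} N(0, \sigma^2).$$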

6.3. Proofs of Other Theorems. We now give proofs of other theorems in the paper. 
Proof of Theorem 2 : We first show that for each j e J, 

(6.39) d jn = a jn + P (n~ 1/2 h- 3d/2 ) and 

a 2 = a 2 +0 P (n^ 2 h-^ 2 ). 
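For this, we show that for all $j, k = 1, \ldots, J$,

$$\sup_{x\in\mathcal{S}_j\cap\mathcal{S}_k}|\hat{\rho}_{jk,n}(x) - \rho_{jk,n}(x)| = O_P(n^{-1/2}h^{-d}). \tag{6.40}$$

Write $\sup_{x\in\mathcal{S}_j\cap\mathcal{S}_k}|\hat{\rho}_{jk,n}(x) - \rho_{jk,n}(x)|$ as

$$\sup_{x\in\mathcal{S}_j\cap\mathcal{S}_k}\left|\frac{1}{nh^d}\sum_{i=1}^{n}Y_{ji}Y_{ki}K^2\left(\frac{x - X_i}{h}\right) - \frac{1}{h^d}\mathbf{E}\left[Y_{ji}Y_{ki}K^2\left(\frac{x - X_i}{h}\right)\right]\right|.$$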

Let $\varphi_{n,x}(y_1, y_2, z) = y_1y_2K^2((x - z)/h)$ and $\mathcal{K}_n = \{\varphi_{n,x}(\cdot,\cdot,\cdot) : x \in \mathcal{S}_j \cap \mathcal{S}_k\}$. We define $N(\varepsilon, \mathcal{K}_n, L_2(Q))$ to be a covering number of $\mathcal{K}_n$ with respect to $L_2(Q)$, i.e., the smallest number of maps $\varphi_j$, $j = 1, \ldots, N_1$, such that for all $\varphi \in \mathcal{K}_n$, there exists $\varphi_j$ such that $\int(\varphi_j - \varphi)^2\,dQ \le \varepsilon^2$. By Assumption 2(b), Lemma 2.6.16 of van der Vaart and Wellner (1996), and Lemma A.1 of Ghosal, Sen and van der Vaart (2000), we find that for some $C > 0$,

$$\sup_Q\log N(\varepsilon, \mathcal{K}_n, L_2(Q)) \le -C\log\varepsilon,$$

where the supremum is over all discrete probability measures. We take $\bar{\varphi}_n(y_1, y_2, z) = y_1y_2\|K\|_\infty^2$ to be the envelope of $\mathcal{K}_n$. By Theorem 2.14.1 of van der Vaart and Wellner (1996), we deduce that

$$n^{1/2}h^d\,\mathbf{E}\Big[\sup_{x\in\mathcal{S}_j\cap\mathcal{S}_k}|\hat{\rho}_{jk,n}(x) - \rho_{jk,n}(x)|\Big] \le C,$$

for some positive constant $C$. This yields (6.40). In view of the definitions of $\hat{a}_{jn}$ and $\hat{\sigma}_n^2$, and Lemma A4, this completes the proof of (6.39).

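Since $g_j(x) \le 0$ for all $x \in \mathcal{X}$ under the null hypothesis and $K$ is nonnegative,

$$\sup_{x\in\mathcal{S}_j}\mathbf{E}\hat{g}_{jn}(x) = \sup_{x\in\mathcal{S}_j}\int g_j(x - uh)K(u)\,du \le \int\sup_{x\in\mathcal{S}_j}g_j(x - uh)K(u)\,du \le \int\sup_{x\in\mathcal{X}}g_j(x)K(u)\,du = \sup_{x\in\mathcal{X}}g_j(x) \le 0,$$

from some large $n$ on. The second inequality follows from Assumption 1(iii). Therefore,

$$\int_{\mathcal{X}}\Lambda_p(\hat{g}_{jn}(x))w_j(x)\,dx \le \int_{\mathcal{X}}\Lambda_p(\hat{g}_{jn}(x) - \mathbf{E}\hat{g}_{jn}(x))w_j(x)\,dx.$$

Hence by using this and (6.39), we bound $P\{\hat{T}_n > z_{1-\alpha}\}$ by

$$P\left\{\frac{1}{\sigma_n}\sum_{j=1}^{J}\left[n^{p/2}h^{(p-1)d/2}\int_{\mathcal{X}}\Lambda_p(\hat{g}_{jn}(x) - \mathbf{E}\hat{g}_{jn}(x))w_j(x)\,dx - a_{jn}\right] > z_{1-\alpha}\right\} + o(1).$$

By Theorem 1, the leading probability converges to $\alpha$ as $n \to \infty$, delivering the desired result. ■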
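Two facts drive the proof of Theorem 2: smoothing a nonpositive function with a nonnegative kernel keeps it nonpositive, and $\Lambda_p$ is nondecreasing, so subtracting a nonpositive mean can only increase $\Lambda_p$. The sketch below illustrates both numerically. It is not taken from the paper: the uniform kernel on $[-1/2, 1/2]$, the function `g`, and the helper names are hypothetical choices made for the example.

```python
def smoothed(g, x: float, h: float, m: int = 2000) -> float:
    """Midpoint-rule approximation of the integral of g(x - u*h) K(u) du for the
    uniform kernel K = 1 on [-1/2, 1/2] (an assumed, illustrative kernel)."""
    total = 0.0
    for i in range(m):
        u = -0.5 + (i + 0.5) / m
        total += g(x - u * h)
    return total / m


def g(x: float) -> float:
    """Hypothetical function satisfying the null restriction g <= 0 everywhere."""
    return -(x * x) / (1.0 + x * x)


def lam(v: float, p: float) -> float:
    """One-sided power Lambda_p(v) = max(v, 0)^p."""
    return max(v, 0.0) ** p


xs = [i / 10 for i in range(-20, 21)]
for h in (0.5, 0.1, 0.01):
    worst = max(smoothed(g, x, h) for x in xs)
    print(h, worst <= 0.0)   # the smoothed function stays nonpositive

# Monotonicity step: Lambda_p(v) <= Lambda_p(v - m) whenever m <= 0.
m_val = smoothed(g, 1.0, 0.1)
print(all(lam(v, 2.0) <= lam(v - m_val, 2.0) for v in (-1.0, 0.0, 0.3, 2.0)))
```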
Proof of Theorem 3: Fix $j$ such that $\Gamma_j(g_j) > 0$. We focus on the case with $p > 1$. The proof in the case with $p = 1$ is simpler and hence omitted. Using the triangle inequality, we bound $|\Gamma_j(\hat{g}_{jn}) - \Gamma_j(g_j)|$ by

$$\left|\int_{\mathcal{X}}\{\Lambda_p(\hat{g}_{jn}(x)) - \Lambda_p(\mathbf{E}\hat{g}_{jn}(x))\}w_j(x)\,dx\right| + \left|\int_{\mathcal{X}}\{\Lambda_p(\mathbf{E}\hat{g}_{jn}(x)) - \Lambda_p(g_j(x))\}w_j(x)\,dx\right|.$$

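There exists $n_0$ such that for all $n > n_0$, $\sup_{x\in\mathcal{S}_j}|\mathbf{E}\hat{g}_{jn}(x)| < \infty$ by Lemma A4. Also, note that $\sup_{x\in\mathcal{S}_j}|g_j(x)| < \infty$ by Assumption 1(i). Hence, applying Lemma A3, from some large $n$ on, for some $C_1, C_2 > 0$,

$$|\Gamma_j(\hat{g}_{jn}) - \Gamma_j(g_j)| \le C_1\sum_{k=0}^{\lceil p-1\rceil}\int_{\mathcal{X}}|\hat{g}_{jn}(x) - \mathbf{E}\hat{g}_{jn}(x)|^{p-kz}w_j(x)\,dx + C_2\sum_{k=0}^{\lceil p-1\rceil}\int_{\mathcal{X}}|\mathbf{E}\hat{g}_{jn}(x) - g_j(x)|^{p-kz}w_j(x)\,dx,$$

where $z = (p-1)/\lceil p-1\rceil$. Observe that $0 < z \le 1$.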

As for the second integral, take $\varepsilon > 0$ and a compact set $D \subset \mathbb{R}^d$ such that $\int_{\mathcal{X}\setminus D}w_j(x)\,dx < \varepsilon$ and $g_j$ is continuous on $D$. Such a set $D$ exists by Lemma A1. Since $D$ is compact, $g_j$ is in fact uniformly continuous on $D$. By change of variables,

$$\mathbf{E}\hat{g}_{jn}(x) - g_j(x) = \int_{[-1/2,1/2]^d}\{g_j(x - uh)K(u) - g_j(x)\}\,du = \int_{[-1/2,1/2]^d}\{g_j(x - uh) - g_j(x)\}K(u)\,du$$

and obtain that for $k = 0, 1, \ldots, \lceil p-1\rceil$,

$$\int_{\mathcal{X}}|\mathbf{E}\hat{g}_{jn}(x) - g_j(x)|^{p-kz}w_j(x)\,dx = \int_{D}|\mathbf{E}\hat{g}_{jn}(x) - g_j(x)|^{p-kz}w_j(x)\,dx + \int_{\mathcal{X}\setminus D}|\mathbf{E}\hat{g}_{jn}(x) - g_j(x)|^{p-kz}w_j(x)\,dx$$
$$\le C_3\sup_{u\in[-1/2,1/2]^d}\sup_{x\in D\cap\mathcal{S}_j}|g_j(x - uh) - g_j(x)|^{p-kz} + C_4\int_{\mathcal{X}\setminus D}\int_{[-1/2,1/2]^d}|g_j(x - uh) - g_j(x)|^{p-kz}w_j(x)\,du\,dx,$$

for some positive constants $C_3$ and $C_4$. Note that the constant $C_4$ involves $\|K\|_\infty$. The first term is $o(1)$ as $h \to 0$, because $g_j$ is uniformly continuous on $D$. By Assumption 1(i), the last term is bounded by

$$C_5\int_{\mathcal{X}\setminus D}w_j(x)\,dx \le C_6\varepsilon, \text{ for some } C_5, C_6 > 0,$$

from some large $n$ on. Since the choice of $\varepsilon$ was arbitrary, we conclude that as $n \to \infty$,

$$|\Gamma_j(\hat{g}_{jn}) - \Gamma_j(g_j)| \le C_1\sum_{k=0}^{\lceil p-1\rceil}\int_{\mathcal{X}}|\hat{g}_{jn}(x) - \mathbf{E}\hat{g}_{jn}(x)|^{p-kz}w_j(x)\,dx + o(1).$$

As for the leading integral, from the result of Theorem 1 (replacing A p (-) there by | ■ \ p ~ kz ), 
we find that 



\9j 



n (x) - Eg jn (x)\ p - kz Wj (x)dx = 0p {n^ kz ^ 2 h^ kz - l ^ d l 2 ). 



x 



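Since $n^{-1/2}h^{-d/2} \to 0$ by the condition of the theorem, we conclude that $\Gamma_j(\hat{g}_{jn}) \overset{p}{\to} \Gamma_j(g_j)$. Using a similar argument, we can also show that

$$\hat{\sigma}_n^2 \overset{p}{\to} \sigma^2 \text{ and } \hat{a}_{jn} = O_P(h^{-d/2}) \text{ for all } j \in \mathcal{J},$$

where $\sigma^2 = \mathbf{1}'\Sigma\mathbf{1} > 0$. Hence

$$\hat{\sigma}_n^{-1}\{\Gamma_j(\hat{g}_{jn}) - n^{-p/2}h^{-pd/2}h^{d/2}\hat{a}_{jn}\} \overset{p}{\to} \sigma^{-1}\Gamma_j(g_j) > 0.$$

Therefore,

$$P\{\hat{T}_n > z_{1-\alpha}\} \ge P\{\sigma^{-1}\Gamma_j(g_j) > 0\} + o(1) \to 1,$$

where the inequality holds by the fact that $n^{-1/2}h^{-d/2} \to 0$ and $\hat{a}_{jn} = O_P(h^{-d/2})$. ■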

Lemma A11: Suppose that Assumptions 1-3 hold, $n^{-1/2}h^{-d} \to 0$, and that $\sqrt{n}\,g_j(\cdot) = \delta_j(\cdot)$, $j \in \mathcal{J}$, for real bounded functions $\delta_j$, $j \in \mathcal{J}$, for each $n$. Then,

- ]T {n^h^ d /%, 5 (g jn ) - ~a jn } A A(0, 1), 
° n 3=1 

where a jn = f BAp(h- d ^ p jn (x)Z 1 + h^ x V^8jJx))wj{x)dx and S jn (x) = J 5 3 {x - 
uh)K(u)du. 

Proof: By change of variables,

$$\sqrt{n}\,\mathbf{E}\hat{g}_{jn}(x) = \sqrt{n}\int g_j(x - uh)K(u)\,du = \int\delta_j(x - uh)K(u)\,du.$$

Since $\delta_j$ is bounded, $\sup_{x\in\mathcal{S}_j}\sqrt{n}\,|\mathbf{E}\hat{g}_{jn}(x)| = O(1)$. Hence
under the local alternatives. Using this and following the proof of Lemma A7, we find that under the local alternatives, $\sigma_{jk,n} \to \sigma_{jk}$. Also, as in the proof of Theorem 1, we use (6.41) and deduce that

1 J 

(6.42) - V rpPh^w* {r.&g - ET^)} A N(0, 1). 

Now, as for n p ' 2 h^ d / 2 a~ 1 'Er j {g jn ), We first note that 

nP /2 h ( P -i) d /2 TjCg . n) = h -^ T ^/W{g 3n - Eg jn } + n^h^Eg^) 

= Tjih-VMpfrix^x) + h^ d ^5 jn (x)). 

We follow the proof of Lemma A4 and Lemma A6 (applying Lemma A2 with A p (v) in 
Lemma A6 replaced by A p (v + /^(p- 1 )/*^) Sj n (x) / pj n (x))) to deduce that 

J {n^ 2 h^ d / 2 EA p (g jn (x)) - EA p (Z jn (x))} Wj (x)dx 0, 

where Z jn (x) = h~ d ^ 'p jn (x)Z 1 + h d ^/^5 jn (x). ■ 



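Proof of Theorem 4: Under the local alternatives, by (6.39) and (6.42),

$$P\{\hat{T}_n > z_{1-\alpha}\} = P\Big\{\hat{\sigma}_n^{-1}\sum_{j=1}^{J}\{n^{p/2}h^{(p-1)d/2}\Gamma_j(\hat{g}_{jn}) - \hat{a}_{jn}\} > z_{1-\alpha}\Big\}$$
$$= P\Big\{\sigma_n^{-1}\sum_{j=1}^{J}\{n^{p/2}h^{(p-1)d/2}\Gamma_j(\hat{g}_{jn}) - \tilde{a}_{jn} + \tilde{a}_{jn} - a_{jn}\} > z_{1-\alpha}\Big\} + o(1)$$
$$= P\Big\{Z_1 + \sigma^{-1}\sum_{j=1}^{J}\{\tilde{a}_{jn} - a_{jn}\} > z_{1-\alpha}\Big\} + o(1). \tag{6.43}$$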

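Fix $\varepsilon > 0$ and take a compact set $A_\varepsilon \subset \mathcal{S}_j$ such that $\int_{\mathcal{S}_j\setminus A_\varepsilon}w_j(x)\,dx < \varepsilon$. Furthermore, without loss of generality, let $A_\varepsilon$ be a set on which $\delta_j(\cdot)$ and $\rho_j(\cdot)$ are uniformly continuous. Then for any $\varepsilon_1 > 0$, there exists $\Delta > 0$ such that $\sup_{z\in\mathbb{R}^d : \|x - z\| < \Delta}|\delta_j(z) - \delta_j(x)| < \varepsilon_1$ uniformly over $x \in A_\varepsilon$. Hence from some large $n$ on,

$$\sup_{x\in A_\varepsilon}|\delta_{jn}(x) - \delta_j(x)| \le \int_{[-1/2,1/2]^d}\sup_{x\in A_\varepsilon}|\delta_j(x - uh) - \delta_j(x)|\,K(u)\,du < \varepsilon_1.$$

Since the choice of $\varepsilon_1$ was arbitrary, we conclude that $|\delta_{jn}(x) - \delta_j(x)| \to 0$ uniformly over $x \in A_\varepsilon$. Similarly, we also conclude that $|\rho_{jn}(x) - \rho_j(x)| \to 0$ uniformly over $x \in A_\varepsilon$. Using these facts, we analyze $\sigma^{-1}\sum_{j=1}^{J}\{\tilde{a}_{jn} - a_{jn}\}$ for each case of $p \in \{1, 2\}$.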
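(i) Suppose $p = 1$. For $\gamma > 0$ and $\mu \in \mathbb{R}$,

$$\mathbf{E}\max\{\gamma Z_1 + \mu, 0\} = \mathbf{E}[\gamma Z_1 + \mu \mid \gamma Z_1 + \mu > 0]\,P\{\gamma Z_1 + \mu > 0\}$$
$$= \{\mu + \gamma\phi(-\mu/\gamma)/(1 - \Phi(-\mu/\gamma))\}(1 - \Phi(-\mu/\gamma))$$
$$= \mu(1 - \Phi(-\mu/\gamma)) + \gamma\phi(-\mu/\gamma)$$
$$= \mu\Phi(\mu/\gamma) + \gamma\phi(\mu/\gamma).$$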
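The closed form $\mathbf{E}\max\{\gamma Z_1 + \mu, 0\} = \mu\Phi(\mu/\gamma) + \gamma\phi(\mu/\gamma)$ can be checked against direct numerical integration. The following is a rough sketch for illustration; the helper names (`phi`, `Phi`, `expect`, `closed_form_p1`), the trapezoid rule, and the parameter values are assumptions, not part of the paper.

```python
import math


def phi(z: float) -> float:
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def Phi(z: float) -> float:
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expect(f, lo: float = -12.0, hi: float = 12.0, n: int = 40000) -> float:
    """Trapezoid-rule approximation of E[f(Z)] for Z ~ N(0, 1)."""
    step = (hi - lo) / n
    total = 0.5 * (f(lo) * phi(lo) + f(hi) * phi(hi))
    for i in range(1, n):
        z = lo + i * step
        total += f(z) * phi(z)
    return total * step


def closed_form_p1(gamma: float, mu: float) -> float:
    """mu * Phi(mu/gamma) + gamma * phi(mu/gamma)."""
    return mu * Phi(mu / gamma) + gamma * phi(mu / gamma)


gamma, mu = 1.7, -0.4   # arbitrary illustrative values with gamma > 0
numeric = expect(lambda z: max(gamma * z + mu, 0.0))
print(abs(numeric - closed_form_p1(gamma, mu)) < 1e-6)
```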
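Taking $\gamma_{jn} = h^{-d/2}\rho_{jn}(x)$, we have

$$\mathbf{E}\max\{\gamma_{jn}Z_1 + \delta_{jn}(x), 0\} - \mathbf{E}\max\{\gamma_{jn}Z_1, 0\}$$
$$= \delta_{jn}(x)\Phi(\delta_{jn}(x)/\gamma_{jn}) + \gamma_{jn}\phi(\delta_{jn}(x)/\gamma_{jn}) - \gamma_{jn}\phi(0)$$
$$= \delta_{jn}(x)\Phi(0) + O(h^{d/2}),$$

uniformly in $x \in \mathcal{S}_j$. Therefore, we can write $\lim_{n\to\infty}\{\tilde{a}_{jn} - a_{jn}\}$ as

$$\lim_{n\to\infty}\int_{\mathcal{X}}\mathbf{E}[\Lambda_1(h^{-d/2}\rho_{jn}(x)Z_1 + \delta_{jn}(x)) - \Lambda_1(h^{-d/2}\rho_{jn}(x)Z_1)]\,w_j(x)\,dx$$
$$= \frac{1}{2}\int_{A_\varepsilon}\delta_j(x)w_j(x)\,dx + \frac{1}{2}\lim_{n\to\infty}\int_{\mathcal{X}\setminus A_\varepsilon}\delta_{jn}(x)w_j(x)\,dx.$$

Since $\delta_{jn}$ is uniformly bounded, there exists $C > 0$ such that the last integral is bounded by $C\varepsilon$. Since the choice of $\varepsilon > 0$ was arbitrary, in view of (6.43), this gives the desired result.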



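(ii) Suppose $p = 2$. For $\gamma > 0$ and $\mu \in \mathbb{R}$,

$$\mathbf{E}\max\{\gamma Z_1 + \mu, 0\}^2 = \mathbf{E}[(\gamma Z_1 + \mu)^2 \mid \gamma Z_1 + \mu > 0]\,P\{\gamma Z_1 + \mu > 0\} = (\mu^2 + \gamma^2)\Phi(\mu/\gamma) + \mu\gamma\phi(\mu/\gamma).$$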
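With the closed form $(\mu^2 + \gamma^2)\Phi(\mu/\gamma) + \mu\gamma\phi(\mu/\gamma)$ in hand, the small-bandwidth behaviour under the scaling $\gamma = h^{-d/4}\rho$, $\mu = h^{d/4}\delta$ can be examined numerically: the difference $\mathbf{E}\max\{\gamma Z_1 + \mu, 0\}^2 - \mathbf{E}\max\{\gamma Z_1, 0\}^2$ approaches $2\phi(0)\delta\rho$ as $h \to 0$. The sketch below is illustrative only; the function names and the values of `rho`, `delta`, and `d` are hypothetical.

```python
import math


def phi(z: float) -> float:
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def Phi(z: float) -> float:
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def second_moment_pos_part(gamma: float, mu: float) -> float:
    """Closed form of E max{gamma*Z + mu, 0}^2 for Z ~ N(0, 1), gamma > 0."""
    t = mu / gamma
    return (mu * mu + gamma * gamma) * Phi(t) + mu * gamma * phi(t)


def scaled_difference(h: float, d: int, rho: float, delta: float) -> float:
    """Difference of second moments with gamma = h^(-d/4) rho, mu = h^(d/4) delta."""
    gamma = h ** (-d / 4) * rho
    mu = h ** (d / 4) * delta
    return second_moment_pos_part(gamma, mu) - second_moment_pos_part(gamma, 0.0)


rho, delta, d = 1.3, 0.7, 1          # hypothetical values of rho_j(x), delta_j(x), dimension
limit = 2.0 * phi(0.0) * delta * rho
for h in (1e-1, 1e-2, 1e-4, 1e-6):
    print(h, scaled_difference(h, d, rho, delta) - limit)   # gap shrinks like h^(d/2)
```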

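Taking $\gamma_{jn} = h^{-d/4}\rho_{jn}(x)$ and $\mu_{jn} = h^{d/4}\delta_{jn}(x)$, we have

$$\mathbf{E}\max\{\gamma_{jn}Z_1 + \mu_{jn}, 0\}^2 - \mathbf{E}\max\{\gamma_{jn}Z_1, 0\}^2$$
$$= \gamma_{jn}^2\{\Phi(\mu_{jn}/\gamma_{jn}) - \Phi(0)\} + \mu_{jn}^2\Phi(\mu_{jn}/\gamma_{jn}) + \mu_{jn}\gamma_{jn}\phi(\mu_{jn}/\gamma_{jn})$$
$$= \{\gamma_{jn}\mu_{jn}\phi(0) + O(h^{d/2})\} + O(h^{d/2}) + \{\mu_{jn}\gamma_{jn}\phi(0) + O(h^{d})\}$$
$$= 2\phi(0)\delta_{jn}(x)\rho_{jn}(x) + O(h^{d/2}), \text{ uniformly in } x \in \mathcal{S}_j.$$

Hence we write $\lim_{n\to\infty}\{\tilde{a}_{jn} - a_{jn}\}$ as

$$\lim_{n\to\infty}\int_{\mathcal{X}}\mathbf{E}[\Lambda_2(h^{-d/4}\rho_{jn}(x)Z_1 + h^{d/4}\delta_{jn}(x)) - \Lambda_2(h^{-d/4}\rho_{jn}(x)Z_1)]\,w_j(x)\,dx$$
$$= 2\phi(0)\lim_{n\to\infty}\int_{\mathcal{X}}\delta_{jn}(x)\rho_{jn}(x)w_j(x)\,dx + O(h^{d/2})$$
$$= \sqrt{\frac{2}{\pi}}\lim_{n\to\infty}\int_{A_\varepsilon}\delta_{jn}(x)\rho_{jn}(x)w_j(x)\,dx + \sqrt{\frac{2}{\pi}}\lim_{n\to\infty}\int_{\mathcal{X}\setminus A_\varepsilon}\delta_{jn}(x)\rho_{jn}(x)w_j(x)\,dx + O(h^{d/2}).$$

The second term is bounded by $C\varepsilon$ for some $C > 0$, because $\delta_{jn}\rho_{jn}$ is bounded. Since the choice of $\varepsilon > 0$ was arbitrary and

$$\int_{A_\varepsilon}\delta_{jn}(x)\rho_{jn}(x)w_j(x)\,dx \to \int_{A_\varepsilon}\delta_j(x)\rho_j(x)w_j(x)\,dx, \text{ as } n \to \infty,$$

in view of (6.43), this gives the desired result. ■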



Proof of Theorem 4*: Let A e c Sj be defined as in the proof of Theorem 4. 
(i) Suppose p — 1. Under take 7 = h~ d l 2 pj n (x) and p = h~ d ^Sj n (x) to get 

Emax{7Zi + /i, 0} - Emax{7Z 1) 0} 

= h- d /% n (x)<S>(h d /% n (x)/p jn (x)) + hr d l 2 p jn {x) [cj){h d l% n {x)/p jn {x)) - 0(0)] 

= h- d '% n {x)m + 1^(0) K„(x)/ Pjn (x)] + 0{h d l% 
uniformly in x G Sj. Therefore, if r)i.o(w,5) = under Hg, we can write \im n ^ > . 00 {dj n — 




= / [ S "j( x )/Pj( x )\ Wj(x)dx + -(f)(0) hm / [5j n (x)/p jn (x)] Wj(x)dx. 

Since S 2 n / pj n is uniformly bounded and the choice of e > is arbitrary, we get the 
desired result. 
(ii) Suppose p — 2. Under H$, we take 7 = h~ d ^pj n (x) and (i = 5j n (x), so that, by a 
Taylor expansion, 

Emax{7Zi + (i, 0} 2 — Emax{7Zi, 0} 2 

= 7 2 {$(/V7) " $(0)} + /i 2 $(/i/7) + 

0(O)/i 7 + jVv)^} + { $ (oy + #0 y} + vn {^(°) + ^'V)^ 

= /i""/ 4 • 2<j>(0)S jn (x)p jn {x) + U%{x) + 0(/^ 4 ), 

uniformly in x G <Sj, where a* denotes a term that lies between and /i/7. Therefore, if 
771,1 (m, 5) = under if 2 <5, then we can write \im n ^. 00 {aj n — aj n } as 

lim / E[A 2 (/i- d/2 p,- n (x)Z 1 + /i- d/ % n (x)) - A 2 {h~ d/2 p jn (x)Z 1 )]w j (x)dx 
n ^°° Jx 

= — lim / 5 2 n (x)wAx)dx 



- f 8 2 {x)wAx)dx H — lim / 5 2 „fx)io,(x)<i:E. 
2J Ae 3 2^ooL> 4 ^ ^ 



Since 5 2 n is uniformly bounded and the choice of e > is arbitrary, we get the desired 
result. ■ 



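Proof of Theorem 5: Similarly as before, we fix $\varepsilon > 0$ and take a compact set $A_\varepsilon \subset \mathcal{S}_j$ such that $\int_{\mathcal{S}_j\setminus A_\varepsilon}w_j(x)\,dx < \varepsilon$ and $\delta_j(\cdot)$ and $\delta_j(\cdot)\rho_j^{-1}(\cdot)$ are uniformly continuous on $A_\varepsilon$. By change of variables and uniform continuity,

$$\sup_{x\in A_\varepsilon}|\delta_{jn}(x)\rho_{jn}^{-1}(x) - \delta_j(x)\rho_j^{-1}(x)| \to 0 \quad\text{and}\quad \sup_{x\in A_\varepsilon}|\delta_{jn}(x) - \delta_j(x)| \to 0.$$

(i) Suppose $p = 1$. For $\gamma > 0$ and $\mu \in \mathbb{R}$,

$$\mathbf{E}|\gamma Z_1 + \mu| = 2\gamma\phi(\mu/\gamma) + 2\mu[\Phi(\mu/\gamma) - 1/2].$$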
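The identity $\mathbf{E}|\gamma Z_1 + \mu| = 2\gamma\phi(\mu/\gamma) + 2\mu[\Phi(\mu/\gamma) - 1/2]$ follows from splitting $|v| = \max\{v, 0\} + \max\{-v, 0\}$, and it can be checked by numerical integration. The sketch below also checks $\phi''(0) = -\phi(0)$, so that $\phi''(0) + 2\phi(0) = \phi(0) = 1/\sqrt{2\pi}$. Helper names, the Simpson rule, and the parameter values are assumptions made for the illustration.

```python
import math


def phi(z: float) -> float:
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def Phi(z: float) -> float:
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def abs_mean_closed_form(gamma: float, mu: float) -> float:
    """2*gamma*phi(mu/gamma) + 2*mu*(Phi(mu/gamma) - 1/2)."""
    t = mu / gamma
    return 2.0 * gamma * phi(t) + 2.0 * mu * (Phi(t) - 0.5)


def abs_mean_numeric(gamma: float, mu: float, n: int = 40000) -> float:
    """Composite Simpson approximation of E|gamma*Z + mu| over [-12, 12]."""
    lo, hi = -12.0, 12.0
    step = (hi - lo) / n                     # n is even
    f = lambda z: abs(gamma * z + mu) * phi(z)
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(lo + i * step)
    return total * step / 3.0


gamma, mu = 0.9, 1.1                         # arbitrary illustrative values
print(abs(abs_mean_numeric(gamma, mu) - abs_mean_closed_form(gamma, mu)) < 1e-6)

# Second derivative of phi at 0 by a central difference: phi''(0) = -phi(0).
eps = 1e-4
phi_dd0 = (phi(eps) - 2.0 * phi(0.0) + phi(-eps)) / eps ** 2
print(abs(phi_dd0 + phi(0.0)) < 1e-6)
```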

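With $\gamma_{jn} = h^{-d/2}\rho_{jn}(x)$ and $\mu_{jn} = h^{-d/4}\delta_{jn}(x)$, we find that uniformly over $x \in \mathcal{S}_j$,

$$\mathbf{E}|\gamma_{jn}Z_1 + \mu_{jn}| - \mathbf{E}|\gamma_{jn}Z_1|$$
$$= 2\gamma_{jn}[\phi(\mu_{jn}/\gamma_{jn}) - \phi(0)] + 2\mu_{jn}[\Phi(\mu_{jn}/\gamma_{jn}) - 1/2]$$
$$= [\phi''(0) + 2\phi(0)]\,\delta_{jn}^2(x)\rho_{jn}^{-1}(x) + O(h^{d/4}).$$

Therefore, we write $\lim_{n\to\infty}\{\tilde{a}_{jn} - a_{jn}\}$ as

$$\lim_{n\to\infty}\int_{\mathcal{X}}\mathbf{E}[\Lambda_1(h^{-d/2}\rho_{jn}(x)Z_1 + h^{-d/4}\delta_{jn}(x)) - \Lambda_1(h^{-d/2}\rho_{jn}(x)Z_1)]\,w_j(x)\,dx$$
$$= \frac{1}{\sqrt{2\pi}}\lim_{n\to\infty}\int_{\mathcal{X}}\delta_{jn}^2(x)\rho_{jn}^{-1}(x)w_j(x)\,dx + O(h^{d/4})$$
$$= \frac{1}{\sqrt{2\pi}}\int_{A_\varepsilon}\delta_j^2(x)\rho_j^{-1}(x)w_j(x)\,dx + \frac{1}{\sqrt{2\pi}}\lim_{n\to\infty}\int_{\mathcal{X}\setminus A_\varepsilon}\delta_{jn}^2(x)\rho_{jn}^{-1}(x)w_j(x)\,dx + o(1).$$

By Assumption 4 and Lemma A4, $\delta_{jn}^2(x)\rho_{jn}^{-1}(x)$ is bounded uniformly over $x \in \mathcal{S}_j$, enabling us to bound the second integral by $C\varepsilon$ for some $C > 0$. Since $\varepsilon$ is arbitrarily chosen, in view of (6.43), this gives the desired result.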

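(ii) Suppose $p = 2$. We have, for each $x \in \mathcal{S}_j$,

\[
\mathbf{E}\{h^{-d/4}\rho_{jn}(x)Z_1 + \delta_{jn}(x)\}^2 - \mathbf{E}\{h^{-d/4}\rho_{jn}(x)Z_1\}^2 = \delta_{jn}^2(x).
\]

Therefore, we write $\lim_{n\to\infty}\{\tilde{a}_{jn} - a_{jn}\}$ as

\begin{align*}
&\lim_{n\to\infty}\int_{\mathcal{X}} \mathbf{E}\left[\Lambda_2\big(h^{-d/4}\rho_{jn}(x)Z_1 + \delta_{jn}(x)\big) - \Lambda_2\big(h^{-d/4}\rho_{jn}(x)Z_1\big)\right] w_j(x)\,dx \\
&\quad= \int_{A_\varepsilon} \delta_j^2(x) w_j(x)\,dx + \lim_{n\to\infty}\int_{\mathcal{S}_j \setminus A_\varepsilon} \delta_{jn}^2(x) w_j(x)\,dx + o(1).
\end{align*}

The second integral is bounded by $C\varepsilon$ for some $C > 0$, and in view of (6.43), this gives the desired result. ■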
Figure 1. Results of Monte Carlo Experiments: $L_1$ test and $\sigma(x) = 1$

[Panels: Uniform weight and sample size = 50; Inverse S.E. weight and sample size = 50. Horizontal axis: DGPs.]

Notes: The 8 solid lines in each panel correspond to our test with 8 different bandwidth values. The 2 dotted lines correspond to the test of Andrews and Shi (2011a) with PA and GMS critical values. The nominal level for each test is $\alpha = 0.05$. There are 1000 Monte Carlo replications in each experiment.
Figure 2. Results of Monte Carlo Experiments: $L_1$ test and $\sigma(x) = x$

[Panels: Uniform weight and Inverse S.E. weight, each for sample sizes 50, 200, and 1000. Horizontal axis: DGPs.]

Notes: See notes in Figure 1.
Figure 3. Results of Monte Carlo Experiments: $L_2$ test and $\sigma(x) = 1$

[Panels: Uniform weight and sample size = 50; Inverse S.E. weight and sample size = 50.]
Figure 4. Results of Monte Carlo Experiments: $L_2$ test and $\sigma(x) = x$

[Panels: Uniform weight and Inverse S.E. weight, each for sample sizes 50, 200, and 1000.]

Notes: See notes in Figure 1.






References 

[1] Anderson, G., O. Linton, and Y.-J. Whang (2012): "Nonparametric estimation and inference about the overlap of two distributions," Journal of Econometrics, forthcoming.
[2] Andrews, D. W. K. (2011): "Similar-on-the-boundary tests for moment inequalities exist, but have poor power," Cowles Foundation Discussion Paper, No. 1815, available at http://cowles.econ.yale.edu/P/cd/d18a/d1815.pdf
[3] Andrews, D. W. K. and X. Shi (2011a): "Inference based on conditional moment inequalities," Cowles Foundation Discussion Paper, No. 1761R, available at http://cowles.econ.yale.edu/P/cd/d17b/d1761-r.pdf
[4] Andrews, D. W. K. and X. Shi (2011b): "Nonparametric inference based on conditional moment inequalities," Cowles Foundation Discussion Paper, No. 1840, available at http://cowles.econ.yale.edu/P/cd/d18a/d1840.pdf
[5] Armstrong, T. B. (2011): "Asymptotically Exact Inference in Conditional Moment Inequality Models," Working Paper, Stanford University, available at http://www.stanford.edu/~timothya/
[6] Bai, J. (2003): "Testing parametric conditional distributions of dynamic models," Review of Economics and Statistics 85, 531-549.
[7] Beirlant, J., and D. M. Mason (1995): "On the asymptotic normality of $L_p$-norms of empirical functionals," Mathematical Methods of Statistics 4, 1-19.
[8] Biau, G., B. Cadre, D. M. Mason, and B. Pelletier (2009): "Asymptotic normality in density support estimation," Electronic Journal of Probability 14, 2617-2635.
[9] Bickel, P. J. and M. Rosenblatt (1973): "On some global measures of the deviations of density function estimates," Annals of Statistics 1, 1071-1095.
[10] Blum, J. R., J. Kiefer, and M. Rosenblatt (1961): "Distribution free tests of independence based on the sample distribution function," Annals of Mathematical Statistics 32, 485-498.
[11] Chernozhukov, V., S. Lee, and A. Rosen (2009): "Intersection bounds: estimation and inference," Cemmap Working Papers, CWP 19/09, available at http://www.cemmap.ac.uk/wps/cwp1909.pdf
[12] Chetverikov, D. (2009): "Adaptive test of conditional moment inequalities," arXiv Working Papers, arXiv:1201.0167v2, available at http://arxiv.org/abs/1201.0167v2
[13] Chiappori, P.-A., B. Jullien, B. Salanié, and F. Salanié (2006): "Asymmetric information in insurance: general testable implications," Rand Journal of Economics 37, 783-798.
[14] Claeskens, G. and I. Van Keilegom (2003): "Bootstrap confidence bands for regression curves and their derivatives," Annals of Statistics 31, 1852-1884.
[15] Csörgő, M. and L. Horváth (1988): "Central limit theorems for $L_p$-norms of density estimators," Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete 80, 269-291.
[16] Delgado, M. A. and J. C. Escanciano (2011): "Conditional Stochastic Dominance Testing," Working Paper, Universidad Carlos III de Madrid and Indiana University.
[17] Delgado, M. A. and J. C. Escanciano (2012): "Distribution-free tests of stochastic monotonicity," Journal of Econometrics, forthcoming.
[18] Delgado, M. A. and W. González Manteiga (2001): "Significance testing in nonparametric regression based on the bootstrap," Annals of Statistics 29, 1469-1507.






[19] Delgado, M. A. and J. Mora (2000): "A nonparametric test for serial independence of regression errors," Biometrika 87, 228-234.
[20] Devroye, L. and L. Györfi (1985): Nonparametric Density Estimation: The L1 View, Wiley, New York.
[21] DiBenedetto, E. (2001): Real Analysis, Birkhäuser, New York.
[22] Durot, C. (2003): "A Kolmogorov-type test for monotonicity of regression," Statistics & Probability Letters 63, 425-433.
[23] Einav, L., A. Finkelstein, and J. Levin (2010): "Beyond testing: empirical models of insurance markets," Annual Review of Economics 2, 311-336.
[24] Fan, Y. and Q. Li (2000): "Consistent Model Specification Tests: Kernel-Based Tests Versus Bierens' ICM Tests," Econometric Theory 16, 1016-1041.
[25] Gao, J. and I. Gijbels (2008): "Bandwidth selection in nonparametric kernel testing," Journal of the American Statistical Association 103(484), 1584-1594.
[26] Ghosal, S., A. Sen, and A. W. van der Vaart (2000): "Testing monotonicity of regression," Annals of Statistics 28, 1054-1082.
[27] Giné, E., D. M. Mason, and A. Y. Zaitsev (2003): "The $L_1$-norm density estimator process," Annals of Probability 31, 719-768.
[28] Hall, P., C. Huber, and P. L. Speckman (1997): "Covariate-matched one-sided tests for the difference between functional means," Journal of the American Statistical Association 92, 1074-1083.
[29] Hall, P. and I. Van Keilegom (2005): "Testing for monotone increasing hazard rate," Annals of Statistics 33, 1109-1137.
[30] Hall, P. and A. Yatchew (2005): "Unified approach to testing functional hypotheses in semiparametric contexts," Journal of Econometrics 127(2), 225-252.
[31] Härdle, W. and E. Mammen (1993): "Comparing nonparametric versus parametric regression fits," Annals of Statistics 21, 1926-1947.
[32] Horowitz, J. L. and V. G. Spokoiny (2001): "An Adaptive, Rate-Optimal Test of a Parametric Mean-Regression Model against a Nonparametric Alternative," Econometrica 69(3), 599-631.
[33] Horváth, L. (1991): "On $L_p$-norms of multivariate density estimators," Annals of Statistics 19, 1933-1949.
[34] Hsu, Y.-C. (2011): "Consistent tests of conditional treatment effects," Working Paper, University of Missouri.
[35] Khmaladze, E. V. (1993): "Goodness of fit problem and scanning innovation martingales," Annals of Statistics 21, 798-829.
[36] Khmaladze, E. V. and H. Koul (2004): "Martingale transforms goodness-of-fit tests in regression models," Annals of Statistics 32, 995-1034.
[37] Koul, H. L. and A. Schick (1997): "Testing for the equality of two nonparametric regression curves," Journal of Statistical Planning and Inference 65, 293-314.
[38] Koul, H. L. and A. Schick (2003): "Testing for superiority among two regression curves," Journal of Statistical Planning and Inference 117, 15-33.
[39] Lee, S. and Y.-J. Whang (2009): "Nonparametric tests of conditional treatment effects," Cemmap Working Papers, CWP 36/09, available at http://www.cemmap.ac.uk/wps/cwp3609.pdf






[40] Liero, H., H. Läuter, and V. Konakov (1998): "Nonparametric versus parametric goodness of fit," Statistics 31, 115-149.
[41] Manski, C. F. (2003): Partial Identification of Probability Distributions, Springer-Verlag, New York.
[42] Mason, D. M. (2009): "Risk bounds for kernel density estimators," Journal of Mathematical Sciences 163, 238-261.
[43] Mason, D. M. and W. Polonik (2009): "Asymptotic normality of plug-in level set estimates," Annals of Applied Probability 19, 1108-1142.
[44] Pinelis, I. F. (1994): "On a majorization inequality for sums of independent random variables," Statistics and Probability Letters 19, 97-99.
[45] Shergin, V. V. (1993): "Central limit theorem for finitely-dependent random variables," Journal of Mathematical Sciences 67, 3244-3248.
[46] Song, K. (2009): "Testing conditional independence via Rosenblatt transforms," Annals of Statistics 37, 4011-4045.
[47] Stein, E. M. (1970): Singular Integrals and Differentiability Properties of Functions, Princeton University Press, Princeton.
[48] Stute, W. (1997): "Nonparametric model checks for regression," Annals of Statistics 25, 613-642.
[49] Stute, W., S. Thies, and L. Zhu (1998): "Model checks for regression: an innovation process approach," Annals of Statistics 26, 1916-1934.
[50] Sweeting, T. J. (1977): "Speeds of convergence in the multidimensional central limit theorem," Annals of Probability 5, 28-41.
[51] Tallis, G. M. (1961): "The moment generating function of the truncated multi-normal distribution," Journal of the Royal Statistical Society. Series B (Methodological) 23, 223-229.
[52] Tripathi, G. and Y. Kitamura (2003): "Testing conditional moment restrictions," Annals of Statistics 31, 2059-2095.
[53] van der Vaart, A. W. and J. A. Wellner (1996): Weak Convergence and Empirical Processes, Springer-Verlag, New York.

Department of Economics, Seoul National University, 1 Gwanak-ro, Gwanak-gu, 
Seoul, 151-742, Republic of Korea, and Centre for Microdata Methods and Practice, 
Institute for Fiscal Studies, 7 Ridgmount Street, London, WC1E 7AE, UK. 

E-mail address: sokbae@gmail.com 

Department of Economics, University of British Columbia, 997 - 1873 East Mall, 
Vancouver, BC, V6T 1Z1, Canada 
E-mail address: kysong@mail.ubc.ca

Department of Economics, Seoul National University, 1 Gwanak-ro, Gwanak-gu, 
Seoul, 151-742, Republic of Korea. 
E-mail address: whang@snu.ac.kr 



